**Session 5: Foundational Aspects of General Intelligence and AI**

**1:30 pm- 2:15 pm GMT**

Keynote Talk

Hector Zenil (Oxford and Alan Turing Institute) – **The Unreasonable Effectiveness of Statistical AI**

Transcript: –

**Sheri Markose**

So it gives me great pleasure to open this session on the foundational aspects of intelligence and AI. For some of you, this is almost like the rematch of a webinar we had on the fifth of November, with different characters now and a different twist to the arguments. But Hector has got this phenomenal title, from Wigner himself, about the unreasonable effectiveness of statistical AI. It's a theme that I picked up on, you know, when I gave the introduction to this conference about the unreasonable effectiveness of AI itself. And so, Hector is a self-avowed natural scientist, and he quotes Gregory Chaitin as having said this of himself: a new kind of practitioner-theoretician. I support that; I mean, it's a new kind of practitioner that we need, one that combines both practice and theory. And of course, his support of statistical AI is not from somebody who is a card-carrying member of that cadre. In fact, he says somewhere that he is surprised that glorified curve fitting can do as well as it does, and in some sense, that reflects my own view. He's the director of Oxford Immune Algorithmics, and a senior researcher at the Turing Institute. So Hector, the floor is yours.

**Hector Zenil**

Thank you very much, Sheri. I'm hoping the sound is fine, because I'm kind of in a public space. I flew in a couple of days ago, and I booked a meeting room; I'm hoping, in the middle of my talk, I'm going to be moving to that place. You raised your hand to show that you want to attend?

**Sheri Markose**

Sorry

**Hector Zenil**

That's okay. That's fine. So first of all, thank you very much, Sheri, for the invitation; it is a pleasure. And indeed, I found this title very interesting, because I've been thinking about this for a while. And I think there are two aspects to this: I think there is indeed an element of success, but there's also some sort of illusion, and I'm going to try to cover some of these aspects. The talk is not going to be completely technical, because I know we all come from different backgrounds. So the idea is to give some sort of overall overview of what we are doing, and some of you may be more or less familiar with some of the things I'm going to be covering. So some of these things that are basically known are some of the limitations and challenges of statistical artificial intelligence, including deep learning. Things like scalability: we require a lot of data to train these methods. There's also a problem of, if you wish, understandability. So statistical machine learning deals very poorly with things like reasoning, modeling, inference, abstraction, and those kinds of things that we are more used to on the computational, symbolic side, and also in the way in which we think humans reason. Another one is explainability. So even when you have some sort of computer model from statistical machine learning, when you open it, there's no state-to-state correspondence with anything from the physical world; there is just, for example, a matrix of numbers that makes very little sense. So we call them black boxes, for that reason. And then there's also the problem of bias, which I think is of a slightly different nature; existing bias comes from the way in which we train these networks, and even symbolic computation can carry this kind of bias. Basically, we can consider humans to be grading these approaches.
But actually, I think the kind of AI that we are doing is less prone to bias by definition, and I'm hoping I'm going to be able to describe it, even briefly. So one of the many challenges in statistical machine learning is what I call deep fragility. And this is an example from a few years ago, but I have another one that is very recent, so this hasn't yet been fully addressed. We have some examples of deep neural networks trained on images, convolutional ones, that get things wrong that for all humans are evidently very wrong. The first one is not that surprising, because actually these people lined up in this way really do look like a cornfield; so let's grant that one. But the blue shoes, for example, is completely wrong; for us it doesn't look anything like shoes. The one labeled airplane is also completely wrong: it is a unicorn. And then we are also familiar with the kind of single-pixel attacks, where you have the deep neural network getting it right, but then you just flip a single pixel and, with very high confidence, the neural network tells you that it is something completely different. So from a bird that it got right, it changes to a frog in the example I have on the screen, or from a dog to a cat, for example. And a few years ago, GANs, generative adversarial networks, were invented with the purpose of making these systems more robust to this kind of attack. And they work pretty well, as long as you don't completely move away from the kind of data distribution that you had before. So that is some sort of artificial hack, because we are not really addressing the main cause of this kind of strange behavior in statistical machine learning.
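The single-pixel failure mode described here can be caricatured in a few lines. Everything below, the "classifier", its weights, and the class names, is invented purely for illustration; it is not a real trained network, just a model whose decision hinges on one pixel, the kind of brittle feature real networks can latch onto.

```python
# Toy "classifier" on 4x4 grayscale images whose decision depends
# almost entirely on a single pixel -- a caricature, not a real network.
def predict(img):
    score_bird = img[0][0]   # the "bird" score comes from one pixel
    score_frog = 0.5         # the "frog" class gets a constant score
    return "bird" if score_bird > score_frog else "frog"

img = [[0.9] + [0.3] * 3] + [[0.3] * 4 for _ in range(3)]
print(predict(img))          # "bird" -- classified correctly

adv = [row[:] for row in img]
adv[0][0] = 0.1              # the single-pixel "attack"
print(predict(adv))          # "frog" -- one pixel flips the label
```

A real attack searches the image for the pixel whose change most moves the network's output; the point of the sketch is only that a model keying on isolated pixels offers exactly such a pixel to find.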

So this is a more recent example, just in case you thought that maybe this behavior occurred because that work was quite old: it keeps happening now, and it won't change, because we are not addressing the root causes. So still, even with higher-resolution images and even better neural network architectures, you can still do these kinds of attacks: flipping pixels, or doing things like this, for example, combining a little bit of noise with the original image. And this neural network, for example, says that all these images are school buses. So again, that is very strange, because for us it looks almost exactly like the same image, but you combine it with some noise, and the neural network gets it completely wrong. And some are more sophisticated, like this one: you may have a neural network very well trained on school buses, but then you start moving the school buses around, and the neural network gets it completely wrong. And this is basically an indication that neural networks understand things very differently from a human being, right? Because for a human being, even when you're perhaps a child or even a baby, you move a thing around and it is still the same thing; we can easily identify the same object, even if we move it around. The same if, instead of moving the object, you move the camera and the angle: again, neural networks will get it wrong, because they are basically being trained on very specific angles, and they do very poorly when you change the distribution or the context. So they are very sensitive to context. This is another interesting one, where you have the neural network identifying each object in the right way independently, so the monkey and the motorcycle, for example. But then you put them together, or you overlap them, and the neural network says with high confidence that these objects are something completely different. And again, that wouldn't happen to a human being.
So even if you want to defend how neural networks are different from humans, this is not what we would consider intelligent in terms of human beings. One key thing is that no matter how much data you get to retrain your network, you won't ever solve these kinds of problems, because you cannot keep feeding the neural network an infinite number of images, all possible angles, all possible movements. There's no way. So this is a perfect example of how big data doesn't solve everything, and I think actually small data is what fixes these problems. By small data, I mean coming up with model-driven approaches, some AI that actually abstracts the most important features of a school bus, for example, which maybe is related to its function, or what you can put inside it, or that it can carry people, so that then you can actually identify this object as a school bus no matter how you move it around. This is one of my favorite examples, and I'm sure most of you may be familiar with it; there's no reference because this is already well known in the literature. So if you train a neural network, for example, to identify and tell apart cats from dogs, and then you start looking at what the neural network is actually picking up, it turns out that this high level of accuracy comes from pictures of dogs being taken outdoors. So it is actually picking up the green of the grass, for example, and most cat pictures are indoors. And that's the kind of thing that is happening all the time, where we are basically classifying things for the wrong reasons and we don't even notice it. So we are fooling ourselves. So this kind of success at classifying objects, for example, is very misleading.
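The cats-versus-dogs shortcut can be made concrete with a deliberately silly model; all numbers and labels below are invented. A "classifier" that looks only at the mean green channel of the background scores perfectly on a biased training set, and then fails the moment a dog appears indoors.

```python
# "Classifier" that keys on the background rather than the animal:
# label "dog" whenever the image's mean green channel looks like grass.
def classify(mean_green):
    return "dog" if mean_green > 0.5 else "cat"

# Biased training set: dogs photographed outdoors on grass, cats indoors.
train = [("dog", 0.8), ("dog", 0.7), ("cat", 0.2), ("cat", 0.3)]
accuracy = sum(classify(g) == label for label, g in train) / len(train)
print(accuracy)        # 1.0 -- perfect, for entirely the wrong reason

print(classify(0.2))   # a dog indoors -> confidently labeled "cat"
```

The perfect training accuracy is exactly the misleading success the speaker describes: the spurious feature is invisible until the data distribution shifts.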

And I think we have become lazy thinkers and data hoarders. And you can see this, for example, with companies fighting over who has driven their cars longer. So Google comes out and says, we have trained our cars on 20 million miles, and then comes Tesla saying, now we have driven for 30 million miles. And we are measuring the wrong thing, because, at least for me, intelligence would be how much you can learn from driving the fewest number of miles. So this is a personal anecdote, but I actually started driving myself when I was 16 years old, and that was by stealing my dad's car. And I'm not saying that was the right thing. But actually, I learned from just watching my dad driving, and then I would steal the car and start driving myself within a couple of hours. I'm not saying I was more intelligent, but human intelligence is that when we see someone else doing something, we almost immediately pick it up. This approach from self-driving companies is almost like bragging about your kid studying at high school for 20 years, or 100 years. And it is definitely the wrong approach to what we call human intelligence. So human intelligence is more about finding the shortcuts, being lazy in a sense, rather than actually driving 20 million miles with your car; driving the least possible. So definitely, current traditional AI doesn't think or learn like us, and this is not a compliment. And the consequences are quite surprising when we see AI in action and it doesn't get it right in situations that are very obvious for us. So this was in Tokyo, for example, in 2021: a self-driving small bus driving over an athlete, which wouldn't have happened to us, because these vehicles actually drive at 10 kilometers per hour or something; they are very, very slow. So it is kind of surprising how, in such a developed country, one leading in AI and robotics, things can go this wrong.
Now we can see it also with the large companies, with Alexa and Siri, for example, which barely understand what we are saying. And if you change anything, even saying, for example, lights instead of lamps, so "turn on my lights" instead of "turn on my lamps", it doesn't get it right. And this is kind of a self-criticism, by the way, because I was kind of involved in early Alexa and Siri. Fortunately, I can say that what we did works pretty well; it is for answering factual questions. But I'm still surprised how far behind they are when we compare them to human understanding. And the same when we apply things that appear to be very sophisticated, such as Watson in the Jeopardy game, for example: they tried to apply Watson to healthcare for now perhaps two decades, and they failed epically. And now they are actually trying to sell Watson to whoever wants to buy the technology, for a very small fraction of what it actually cost. And that is because there's this problem of translating something that appears to be very good at one thing, and then you want to apply it to some real-world situation such as healthcare and it completely fails. Still, deep neural networks, and don't get me wrong, I think they are pretty good at what they do: they classify things very well, as we know. They are also very good at representing things, which I think is a very good advancement; they represent images, for example, in numerical form, and then we can compute with them. But it's also clear now that they

do things very differently from humans. And this is another very simple example. To make a pure neural network understand the concept of a positional number system, for example, would be very difficult. One would need to come up with some sort of very specific architecture to deal with a positional number system; there's no way that a convolutional neural network, for example, would understand that only by looking at any number of images of numbers. The other example is to try to teach neural networks to do basic arithmetic; that would also be impossible. This is another example of how, no matter how much big data you train on, you will never achieve this purpose, because you would need to show the neural network all possible arithmetic operations, and as we know, there is an infinite number of them, so it wouldn't be possible to train the neural network to begin with. So obviously, there's something we have to fix in what we currently call AI or machine learning, or, as I call it, statistical AI. Because actually, we were pretty good at doing things like arithmetic: we have these calculators, and computers are supposed to excel at these kinds of things. So it looks like the solution is basically to start combining these two things, and this is what some research groups are already doing. I think we haven't made much progress, but there are some interesting suggestions, and my group, for example, is trying to make progress on this task of combining the best of both worlds. So this is another example of how, I think, current statistical AI is kind of misleading people into believing that it is doing something that perhaps it is not doing. This is related, for example, to AlphaFold 2, to which we are attributing solving the problem of protein folding.
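For contrast, the symbolic rule a network would have to discover here is tiny. A minimal sketch of the positional rule (value = sum of digit times base to the power of position), which generalizes to any digits and any base with no training data at all:

```python
def from_digits(digits, base=10):
    # Positional number system: each step shifts the accumulated value
    # one place to the left, then adds the next digit.
    value = 0
    for d in digits:
        value = value * base + d
    return value

print(from_digits([1, 2, 3]))           # 123
print(from_digits([1, 0, 1], base=2))   # 5 (binary 101)
```

The five-line rule covers infinitely many cases, which is exactly what no finite training set of examples can do.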
What I think is that with AlphaFold 2 there was still way too much human intervention to attribute the success of the challenge to AI alone. And the human intervention came from several sources: one was the choice of the architecture and the training itself, but another was the interpretation. So when we think that AlphaFold 2 solved the problem of protein folding, there was also a lot of interpretation in the pipeline. And this is an example of traditional statistics providing some insight into how or why cars on the road tend to cluster in small patches. If you measure the time between any two cars, you would see that it follows a Poisson distribution. But the description is not really giving you the cause. And if you really want to extract or infer what is going on, why cars are distributed in this way, you have to do some sort of post-hoc interpretability, where human scientists are actually making the call. And the reason is very simple: there's always a slower car in front, right? So the reason for this distribution is simply that the cars behind cannot overtake. And you can break this distribution, by the way, just by adding lanes, for example: if cars are able to overtake each other, then you destroy the distribution. But again, that is an external perturbation, an experiment designed by a human scientist, that is allowing us to make this kind of interpretation. And that's what I think happened with AlphaFold 2, for example.
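The slower-car-in-front mechanism is easy to simulate. Below is a minimal single-lane model (the setup and all numbers are my own, not the speaker's exact example): with no overtaking, each car is capped by the speed of every car ahead of it, so traffic bunches into patches; deleting that `min()` coupling, i.e. adding lanes, restores the even spread.

```python
import random

random.seed(1)
n = 50
start = sorted(random.uniform(0, 1000) for _ in range(n))   # rear -> front
desired = [random.uniform(20, 40) for _ in range(n)]        # preferred speeds

# No overtaking: car i can go no faster than every car ahead of it.
speed = desired[:]
for i in range(n - 2, -1, -1):
    speed[i] = min(speed[i], speed[i + 1])

t = 500.0
pos = [s + v * t for s, v in zip(start, speed)]
gaps = [b - a for a, b in zip(pos, pos[1:])]
print(min(gaps), max(gaps))   # many tiny gaps behind slow cars, a few huge ones
```

The statistics of the gaps only describe the clustering; the cause is the `min()` constraint, which is exactly the kind of mechanism a human scientist reads off and a purely statistical description does not.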

So here are some possible interpretations of what happened with AlphaFold 2, because I was quite surprised, and I guess everybody was, since we have been doing molecular dynamics for decades now. And molecular dynamics is very much physics-driven: we have created simulation platforms where a lot of physical laws intervene, like thermodynamics and Brownian motion at the atomic level, in order to come up with partial solutions to protein folding. So the question was, how come a completely statistical system like AlphaFold 2 could solve this challenge when it is so much physics-driven? The first interpretation I came up with, which I now realize, or at least am hoping, is wrong, was: well, maybe the world is much more statistical than we thought; maybe all these physical laws are kind of concealing the statistical nature of the world. But I think, fortunately for us, and even for science, that is not the case. What I think is now happening, with some evidence coming from papers like the one I'm citing here at the bottom, is that what we have ended up doing is encoding all those physical laws, and the mapping between amino acid strings and 3D structures. We have done so much work that it appears in some way statistical even when it is not. When we have a statistical AI approach training on those training sets, then we have a situation in which the problem appears to be solved using only statistics, but there's a lot of non-statistical work behind it. But obviously, there are some open questions as to whether, for example, this is actually a combination of both statistical patterns and physics-driven models, and I think there's still a long way to go to understand what happened with AlphaFold 2. But in general, I think we're doing pretty well at finding ways to accelerate science by having statistical AI combined with human scientists.
Eventually, we are hoping that we can even have some sort of non-statistical AI that can actually close the experimental cycle, and that's what we're doing at the Alan Turing Institute, for example. Here on the screen are basically some examples of how combining the two approaches is making progress in science. On the other hand, we should not let ourselves be misled by these statistical AI successes, because, for example, in the case of AlphaFold 2, there's not much that we can add directly to molecular biology textbooks only by looking at the results. There has to be some human scientist's interpretation to actually add new knowledge to textbooks from the successes of AlphaFold 2. So I think this diagram illustrates the situation: the internet and the state of data analytics in the 90s, for example, brought us a lot of data, but current AI is mostly good at classifying things, so it's kind of still in the information age. And what we are really trying to do is to move towards knowledge. So to extract knowledge from the successes of statistical AI, we still have humans in the loop, and that is accelerating science and knowledge. But then, to really move to the next levels, we need some different AI. And hopefully, we as humans are not going to be regressing, going backwards in some sort of way to conspiracy theories; we actually have to work our way to avoid going that far. So this is a very nice quotation from Judea Pearl, and we completely resonate with it: to really build intelligent machines, we have to teach AI cause and effect. I definitely agree with this quotation.

And this is our own research.

So this is what we have found with our own research: that even average humans think much more like human scientists. One very easy example of how this works is to think of how we learn, for example, phone numbers. When someone gives you their phone number, we are not only looking for statistical patterns in order to memorize it; we are also looking for algorithmic patterns. So for example, if the phone number were something like 123456, it is very easy for us to remember, because we know it is basically the successor function. And if you notice, there's actually no statistical pattern in such a sequence, because it is Borel normal; it has maximal entropy. So definitely, human minds think algorithmically; or, on top of thinking statistically, they also think algorithmically. And this is a very interesting experiment that we performed; it is a kind of inverse Turing test, because we asked people to produce randomness according to themselves. And we compared those answers against the best kind of randomness that computers can produce, which, under some strong version of the Church-Turing thesis, is basically the best randomness that can be produced: that is, algorithmic randomness, the mathematical definition of randomness. We found that the answers given by humans were actually much closer to what we consider algorithmic randomness than to statistical randomness. So we have evidence that we should be moving towards this algorithmic basis, and this is what we have been doing recently; somehow we can attribute it to the field of symbolic learning. And this is a selection of papers: we have been applying this kind of research, which I'm going to briefly explain in the next slide or so, to different areas of science, including genetic networks and challenges such as nucleosome positioning, for example, which is considered to be the second greatest challenge after protein folding.
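The phone-number point can be checked numerically: a digit sequence like 1234567890 repeated has maximal per-digit Shannon entropy (no statistical pattern for a frequency-based method to exploit), yet a one-line program regenerates it exactly.

```python
from collections import Counter
from math import log2

def shannon_entropy(s):
    # Empirical Shannon entropy (bits per symbol) of a string.
    n = len(s)
    return -sum(c / n * log2(c / n) for c in Counter(s).values())

seq = "1234567890" * 10          # uniform digit frequencies
print(shannon_entropy(seq))      # log2(10) ~ 3.32 bits: maximal for digits

# ...yet the sequence is algorithmically trivial (the successor function):
gen = "".join(str(i % 10) for i in range(1, 101))
print(gen == seq)                # True
```

Statistically the sequence is indistinguishable from noise; algorithmically it is about as compressible as a string can be.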
Nature released a very nice video of our research that I would highly recommend you watch if you want to know a little bit more about the details. They did this on their own initiative, but I think they did a great job. We call our approach causal deconvolution. And what we do is basically to find the computer programs, the computable models, that are able to explain the data. You can see how this is related to Kolmogorov complexity, but unlike Kolmogorov complexity, or algorithmic complexity, we are not necessarily interested in the shortest computer programs; we are interested in the set of computer models that can explain a piece of data. And obviously, we can rank those models by how large or small they are, and according to algorithmic probability, we can rank them and assign higher likeliness to the shorter models. But we are not only interested in shorter models; we see algorithmic complexity as a much richer scientific research field in the context of causality, for example. So here's the example I was giving before. When you have the sequence of natural numbers, if you wanted to characterize the sequence using traditional statistics, you wouldn't find any pattern, because, as I said, it is Borel normal, so there is no over-representation of any subsequence.
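The ranking idea can be sketched in miniature. This is my own toy illustration, not the group's actual method: enumerate tiny candidate "programs" (here, Python expressions in the index `i`), keep every one that reproduces the data, and weight shorter programs higher, in the spirit of algorithmic probability's 2^(-|p|).

```python
data = [1, 2, 3, 4, 5, 6]

# Tiny hypothesis space of candidate "programs" over the index i.
candidates = ["i + 1", "i * 1 + 1", "(i + 2) - 1", "i * i + 1"]

models = []
for prog in candidates:
    out = [eval(prog, {"i": i}) for i in range(len(data))]
    if out == data:
        # Shorter program -> higher algorithmic-probability-style weight.
        models.append((prog, 2.0 ** -len(prog)))

models.sort(key=lambda m: -m[1])
print(models[0][0])   # "i + 1": the shortest surviving model ranks first
```

Note that several distinct programs survive; the approach keeps the whole set of explanations rather than only the single shortest one, which is the point the speaker makes about going beyond Kolmogorov complexity.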

So with our models, we are actually able to characterize these kinds of sequences, because there's obviously a computable model behind them: there's a computer program that is able to produce the sequence. And what we do is to explore the space of possible computable models, that is, basically, the computer programs that are able to produce that sequence. For this sequence, because it's very simple, there are many computer programs that can produce it, and that gives us insight into the kind of model, and actually even more than the data itself. So here is an example of how we can characterize that sequence: we find a binary counter Turing machine that is able to produce the sequence in binary, and then we have a computer program of fixed size that is able to translate that sequence into a sequence of natural numbers. And the very nice feature of this approach is that we actually have access to those computer programs, and we can analyze the state diagram. So you can see how this is a completely white box instead of a black box: you open the description, and you have a state-space description of your data, instead of just a matrix of numbers that doesn't represent anything in the real world. We have proven that obviously you can make some progress with this, but obviously it is computationally very expensive. So we have some methods in which we break down a piece of data, find small computable descriptions for each piece of that data, and then put them together. And this approach would very nicely complement statistical approaches, because what we are trying to do is kind of emulate how human scientists think, right? Because, again, science is all about finding these descriptions. So when we do science, we want some sort of rule or law that is able to plausibly explain a piece of data, instead of a completely statistical description.
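The two-stage model described here, a binary counter plus a fixed-size translator, can be mimicked in a few lines. This sketch uses ordinary Python rather than an actual Turing machine, but it has the same white-box structure: both stages are inspectable programs, not a matrix of weights.

```python
def binary_counter(n):
    # Stage 1: emit the binary expansions 1, 10, 11, 100, ...
    for k in range(1, n + 1):
        yield bin(k)[2:]

def translate(bits):
    # Stage 2: a fixed-size decoder from binary strings to naturals.
    return int(bits, 2)

seq = [translate(b) for b in binary_counter(6)]
print(seq)   # [1, 2, 3, 4, 5, 6]
```

Composing the two short programs reproduces the sequence of natural numbers exactly, and either stage can be read and analyzed on its own.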
And we think that statistical AI is going to continue being very useful in some tasks, such as, for example, taxonomic classification; that is an obvious one for statistical AI. But in some others, and one is, for example, medicine, we believe there have to be some fundamental changes in how we are approaching them with AI, because neural networks will fail. We can see, for example, with Watson, which, by the way, was not a neural network, that a completely statistical approach will fail at actually doing anything amazing or meaningful in healthcare and medicine beyond just classifying things, like classifying tumors and those kinds of things. At finding causal explanations, they are by design ill-equipped. For scientific discovery, this other kind of AI will be fundamental; again, neural networks, deep learning, and statistical machine learning will be unsuitable for really closing the experimental loop and coming up with new discoveries on their own with no human intervention. Again, with human intervention everything is possible, because it depends on how much you put on the human side versus the AI side. And actually, as I'm going to explain in a few slides, we are also coming up with ideas on how to classify or rank the level of contribution of AI in scientific discovery versus, for example, having a human scientist in the loop. So humans are very good at these things, and current AI has a very hard time trying to reproduce them. Like, for example, understanding human literature, and I know this is an unfair advantage, right? Because we are talking about the context, the human context; obviously, we're pretty good at that. Human scientists are also really good at picking interesting problems. And again, this is completely unfair, right? But that's what we want AI for: we want AI to pick the right problems, to solve the right problems, which are basically the problems that we as humans are interested in.
Human scientists are also very good at abstracting salient features.

And finding first principles. So there is the whole thing about science: finding rules and laws from big and also small data. We're also very good at producing mental models of the real world, something that current statistical AI is pretty bad at. And then again, an unfair advantage is that we are very good at communicating results; but we also want AI to be very good at communicating results in order to make progress. So what we are working on at the Alan Turing Institute is what we call the Nobel Turing Grand Challenge. The idea is to set the milestones that would be necessary to achieve in order to eventually have some sort of AI, independent of humans, able to make scientific discoveries worthy of a Nobel Prize, for example. Not necessarily the Nobel Prize per se; we are just using the Nobel Prize as an idea for these purposes, and we are aware that obviously there's a lot of research that hasn't been given a Nobel Prize and is still worth as much, if not more, than Nobel Prize research. There has been some progress, but we believe that much more progress still needs to be made. We had a workshop in 2020, and we are having another one next month; I will be happy to forward the invitation for all of you to attend and participate, and we would be very happy to have you as well. And we think this kind of AI is fundamental to accelerating science, because we are producing more and more papers at an exponential speed. It is also true that we are producing scientists at an exponential growth rate, but the problem is that any single scientist has a limited attention and retention span. So there are still open questions on whether scientists require, for example, a minimum of knowledge to make non-incremental discoveries. And mathematically, based on a few assumptions, you can prove that there has to be a threshold, right?
Because if the amount of science produced continues growing, there's going to be some limit at which an individual is going to see themselves surpassed. So we think we need these kinds of AI to help human scientists in the future.

**Sheri Markose**

Can I jump in there? You have about one minute to wrap up, so that we can take at least a few questions before we go to the general discussion and the other talks.

**Hector Zenil**

Absolutely. So I think there are only a couple more slides. What we're also doing at the Alan Turing Institute is trying to come up with a general overview of the state of AI. You can see it there: we were trying to put things together and then map every one of these fields to whatever scientific field it may be relevant to, or not. And these are some of the milestones, but I'm not going to go through them. Basically, what we're trying to say is, okay, this doesn't have to be the progression, because it is very difficult to predict, but this may happen before, for example, an AI is able to win a Nobel Prize, or produce a full scientific article, and so on. And the other thing I mentioned briefly, and we obviously don't have time anymore, is the evaluation: how much responsibility in the discovery cycle are we transferring to the machine versus the human? AlphaFold 2, for example, is more on the human side right now than the machine side, in our view. And we are trying to come up with some sort of evaluation similar to self-driving cars, where we can say: actually, this experiment or this discovery had absolutely no automation; or it's only level one, where we have some machine assistance; or it is partial automation, and those kinds of things. This is a work in progress. We are having a workshop, and also a publication, in the next few months, and we are going to be happy to release and discuss all these things. But I'm going to stop there. Thank you very much.

**Sheri Markose**

Thank you very much. It's good to hear a skeptical sort of view, rather than an overwhelming, you know, acclamation of AI as it stands, and statistical AI, and all that. But interestingly, you know, in some situations maybe you do not need causal models; I'm playing devil's advocate, and there will be people here who would want to pile in on that. And I have to say, on your examples of how algorithms work: I myself don't think the brain works that way. We don't do statistical modeling in the brain; it is algorithmic, but not in the way that you have given us in your example. So obviously, I'll be talking about how I think the brain utilizes algorithmic methods of cognition and logic. So let me open it up to a few questions right now, and of course we move into a more general discussion after the two speakers, Karl and Vincent, have spoken. So can I have some raised hands? You can jump in, those who are already here as panelists. Karl?

**Karl Friston**

So in both the presentation, which was very intriguing, and Sheri's question, there was this fundamental distinction between statistical and algorithmic approaches. Is that fair, in the sense that another way of carving up those two types of models or algorithms would be into continuous state-space models versus discrete state-space models, even cast as a sort of quantum system, bits as opposed to real numbers? What if you had an algorithm that was statistical but operated on quantized representations: discrete, symbolic, categorical states of the world, where you can only be in one state and not in all the others? Do you really need the distinction between algorithmic and statistical?

**Hector Zenil**

So actually, I don’t think that distinction is fundamental. I think the algorithmic is a superset of statistical descriptions, so the statistical is contained in the algorithmic space. But you make a fascinating comment, and I don’t think algorithmic necessarily means discrete. For example, in mathematics we deal with differential equations, and in the end, when we simulate them, or perhaps even solve them, we almost always move to a discrete space, even when we believe that we are working in a continuous space. But it is very much symbolic. So in that way, I don’t think they are opposed in any way, and there could be something algorithmic in the continuous space.
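Hector's point that simulating a differential equation means moving to a discrete space can be made concrete with the simplest possible example: forward-Euler integration, where the continuous flow dx/dt = f(x) is replaced by a purely discrete, symbolic update rule. This is a minimal sketch, not anything from the talk itself.

```python
import math

def euler(f, x0, t_end, dt):
    """Forward-Euler integration: the continuous ODE dx/dt = f(x)
    becomes a discrete update rule x <- x + dt * f(x)."""
    x, t = x0, 0.0
    while t < t_end:
        x += dt * f(x)  # discrete step replaces the continuous flow
        t += dt
    return x

# dx/dt = -x has the exact continuous solution x(t) = x0 * exp(-t),
# yet what we actually compute is a finite sequence of discrete steps.
approx = euler(lambda x: -x, x0=1.0, t_end=1.0, dt=1e-4)
exact = math.exp(-1.0)
```

Shrinking `dt` makes the discrete iterate approach the continuous solution, which is precisely the sense in which "continuous" mathematics is operated on symbolically and discretely in practice.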

**Karl Friston**

Spoken like a true quantum physicist. Do you like category theory? Would that be something that you could relate to?

**Hector Zenil**

Yeah, absolutely. Yeah.

**Sheri Markose**

So can I jump in there as well? Throughout, I think there has been a bit of a red herring: that there is a distinction between statistical analyses and algorithmic ones. It might appear that way; I hope to convince you differently. In statistics, the problem is, as I think I’ve mentioned to Karl, that you have the list of enumerable possible events, and then you attach probabilities to it. My problem has always been that this list is a given. How do you advance that list? In other words, how do you exit from that list? It has always been a problem to understand how you would attach a probability to an event that was or was not on that list; you know, this idea of radical uncertainty. This has always been a problem, because any frequentist method requires you to have the enumerable list to begin with, and then you just attach probabilities that add up to one. So the problem is actually with the set of outcomes itself, which you might or might not be able to enumerate. That is one of the problems. And the second thing is, of course, when Hector mentioned issues of creativity: that is how I read into how a scientist would come up with a new problem that was never ever thought about. And you say that machines, the sort of AI that we have at the moment, may not be able to do that. But actually, I need to say that this is something where, well, those who know who I’m talking about know him very well, he says, to hell with it: that is what the power of algorithms is all about, and AI can do it, because we combine various pieces of software. So the power of generative recombination is very good, and I can’t argue with that. The problem is not whether you can generate all sorts of new syntactic objects that were not previously there. The problem, as you will see, is which of these will then fit into your model and which of them will not; that is where the creative element would come in. Hector?
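Sheri's point about frequentist methods can be illustrated with a toy model: probabilities live over a fixed, enumerated outcome set that sums to one, so an event that was never on the list has no probability at all within the model. The outcome names and numbers below are invented for illustration.

```python
# A frequentist model presupposes a fixed, enumerated sample space
# whose probabilities sum to one (the entries here are made up).
outcomes = {"boom": 0.2, "bust": 0.3, "stagnation": 0.5}

assert abs(sum(outcomes.values()) - 1.0) < 1e-12

def prob(event: str) -> float:
    """Return the probability of an enumerated event; radical
    uncertainty shows up as events the model simply cannot price."""
    if event not in outcomes:
        raise KeyError(f"{event!r} is outside the enumerated sample space")
    return outcomes[event]
```

The `KeyError` is the formal counterpart of Sheri's objection: the machinery only attaches numbers to the list it started with, and says nothing about how the list itself gets extended.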

**Hector Zenil**

Yeah, I fully agree, and the comment I want to make is related to the suggestion you made before. I’m not suggesting, by the way, that the mind comes up with Turing machine descriptions. Fortunately, because we have all these algorithmic equivalences, even if it is something else that we use as mental models, I think there is a fundamental mapping to algorithmic models. So in no way am I saying, for example, that the human mind is a Turing machine, because that is a common objection. What we are doing is exploring the computable model space, which can be implemented and instantiated by a Turing machine or something else, but which is still algorithmic in nature, or symbolic in nature. Because in my example with differential equations, when I was saying that we kind of fool ourselves into thinking that they live in a continuous space: they absolutely do not. All this mathematics that we have come up with is precisely to deal with that continuous space and do symbolic regression. So algebra can be thought of in that way, and the same for category theory: the way in which we represent things with symbols, to operate in the symbolic space, which is very much discrete.

**Sheri Markose**

Yeah, of course. Before we move to Karl’s talk, there’s one more thing I’m going to press you harder on, Hector, which is the curve fitting argument: you are saying AI is as successful as it is at fitting curves, and that is exactly right, at minimizing prediction error. And that could lead into Karl’s talk straight away, because what you have there is the realized outcome and a preconceived model or whatever, and the deviations between them are what you want to minimize; that is what you do in any sort of curve fitting. Except Karl would say there is a lot of deep inference, there are hidden nodes and things like that, which the deep methods of AI at the moment can find out. But at the end of the day, it is still fitting, because you are minimizing prediction error. Correct? And then, of course, Karl will tell us more about how successful the current deep learning methods are at doing that.
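The "curve fitting as minimizing prediction error" point can be shown in a few lines: fitting a line y = a*x + b by gradient descent on the mean squared prediction error. This is a generic illustration of the idea, not the deep learning or free-energy formulations the speakers discuss; the data are synthetic.

```python
# Minimal curve fitting: minimise the mean squared prediction error
# for a line y = a*x + b by gradient descent (illustration only).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # synthetic data generated by y = 2x + 1

a, b, lr = 0.0, 0.0, 0.01
for _ in range(20000):
    # prediction error at each data point
    errs = [(a * x + b) - y for x, y in zip(xs, ys)]
    # gradients of the mean squared error with respect to a and b
    grad_a = 2 * sum(e * x for e, x in zip(errs, xs)) / len(xs)
    grad_b = 2 * sum(errs) / len(xs)
    a -= lr * grad_a
    b -= lr * grad_b
```

Every step moves the parameters in whatever direction shrinks the prediction error; deep networks do the same thing with many more parameters and hidden layers, which is the sense in which they remain "glorified curve fitting".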