**Session 4: Data Science for Interconnected Systems and Epidemics**

**11:15 am-12:00 pm GMT**

Keynote

Tim Rogers (University of Bath) – **The Speed of Contagion**

Transcript: –

**Ben Etheridge**

Let’s start this fourth session. We are delighted to start with a keynote from Professor Tim Rogers, who is professor of maths, mathematics sorry, at the University of Bath. Tim works on understanding, in particular, the behavior of complicated random processes in network and spatial structures, and focuses on understanding the emergence of large-scale order from these random interactions. He’s published in a vast array of illustrious outlets, including the Nature journals and the Proceedings of the National Academy of Sciences, amongst other journals, and he is going to talk about the speed of contagion. So, Tim, it’s over to you.

**Tim Rogers**

Thanks very much for that introduction. And thank you for the invitation to speak. Yeah, it’s a shame we’re not in person, but I’m slowly getting used to these online presentations. So I’m going to tell you about some work that I’ve been doing, actually, for about five years now, no, seven years now, on predicting the speed of contagions in networks. Now, I want to say up front: in most of what I’m going to discuss in these slides, I’m going to use terminology about epidemics, and epidemiology is one of the important applications. But disease is not the only thing that spreads in a network. And I know that there’s a diverse range of interests in the audience today, so I want to suggest two other applications that people might want to be thinking about while I’m talking. One of those is information, whether that’s information about, indeed, an epidemic, or something spreading on social media, or information about something in finance, whatever. With the way we communicate now, very quickly through social media, there’s a speed of information spread which is absolutely enormous. And it’s somehow a feature of the platform over which the information is spreading, and the social network it’s embedded in. And then the other application you might want to think about is financial contagion, and I’m sure some of you have worked on this. So, you know, when one institution has some kind of event, that has a knock-on effect for its trading partners, or people that have a stake in it, and so on and so forth.

Okay, so, let me pass to the slide that I like to show. So this is a picture I’ve stolen from Wikipedia. It’s a heat map showing the spread of bubonic plague across Europe. Each of these colored bands tells you when the plague first appeared in these parts of Europe, and you can see this kind of stripy pattern here. So there really was a geographic spread from south to north. And you could, in fact, measure the speed, and the speed was about one mile per day. And that, of course, was because the plague was transmitted by person-to-person contact, and people had to walk places, or they had to ride a horse, and they only went places that they needed to get to. Travel was slow and expensive and didn’t happen very much, basically. And I’m now going to compare this to, no surprises, the obvious current situation with COVID-19. So we had this outbreak that spread extremely rapidly around the world, and did so in a non-spatial way. It wasn’t the case that we had a kind of, you know, east-to-west spread or something like that. It simply started appearing all over the place very rapidly. So this is a fundamentally different process, even though it’s the same human-to-human interactions that are causing the spread of a disease. That part hasn’t changed. But travel has changed, right? And because of that, the spread of this disease is vastly quicker, and in fact it doesn’t really make sense to measure it in miles per hour. So just to illustrate what I’ve just said, this is a map of the world reconstructed from airline routes. You can make out, you know, Central Europe, and America and so on. So you can draw the world just through personal travel networks. So that’s one thing that’s changed. And then the other thing that’s changed is our social networks.
So back when bubonic plague was spreading, people lived in one village for most of their life, and they saw the same 100 people, pretty much. There was none of the level of mixing that modern society has. Now we have these hugely diverse social networks, and there’s the famous six degrees of separation: you can travel around the world in a few hops in a social network. So all of this put together has created an environment where a disease, or indeed information or other things, can spread incredibly quickly. But how quickly? How can we measure this? I said that miles per hour is the wrong kind of unit to measure this in, so what do we measure it in? And how can we go and try to make some kind of prediction? Because if you’re predicting the spread of a disease in, let’s say, a wild population of animals, you might be able to make some kind of prediction in miles per hour, if you know about, you know, animal movements, animal habitats and ranges. So how can we do the same thing for humans? The key thing is that the network is the medium over which this contagion is spreading, and it’s the network that’s going to contribute some crucial factors to deciding the speed of the spread. So the work I’m going to talk about is three papers, as I said, going back seven years. These are the references. I’ve also got my website link at the bottom there, so you can type that in at some point if you want to look it up. My two collaborators on this: Reimer is at King’s College London, and Sam here was a PhD student with me; he is now part of the Warwick team doing the models for SAGE, particularly to do with the vaccination rollout. Sheri, you’ve got a question?

**Sheri Markose**

Yeah, can I just ask a quick clarification? I mean, you talk about speed and miles traveled and so on. Clearly, it is just time taken for numbers to double or somesuch, isn’t it? I mean, yes, that’s also speed of contagion, how many more people are infected, or just days?

**Tim Rogers**

So I’m thinking specifically about a different measure of distance. I’m thinking about network distance. Speed usually involves both time and distance, so it’s not just a doubling time; that would be a rate. For it to be a speed, it has to have a distance. And what’s the notion of distance we want to use? I’m going to argue that the notion of distance we should be using is network distance, and that some knowledge of the network will give you some clue as to where the infection is going to appear.
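To make the notion of network distance concrete, here is a minimal sketch that counts hops from a source node with a breadth-first search. The toy contact network and the function name `network_distance` are illustrative assumptions, not taken from the talk.

```python
from collections import deque


def network_distance(adjacency, source):
    """Return the hop distance from `source` to every reachable node."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in distance:
                # First visit is along a shortest path in an unweighted graph.
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return distance


# Hypothetical toy contact network (not from the talk).
toy_network = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}
distances = network_distance(toy_network, "a")
```

In this picture a speed would be hops per unit time, rather than miles per hour.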

**Sheri Markose**

Okay, thank you. Yeah.

**Tim Rogers**

Right. So those are the references. So let me crack on. We need a simplified model to explore these questions of network contagion, and the model I’m going to use is the SIR model of an epidemic. So SIR, as I’m sure you’re all aware by now, stands for susceptible, infected and recovered. So here’s a tiny little piece of a network. The dots are going to be people, and the lines between the dots represent possible contagious contacts. In the simulation, we start with one initial infected individual, and I put the star by infected to remind me to tell you again that this applies to other models. So it could be an individual with a piece of information that they’re spreading, or it could be some other thing that they’re spreading in the network. So we start with one initial spreader, and this infection may be passed along the edges of the network. And there’s some kind of a delay. So it’s not the case that you instantaneously get sick if one of your social contacts gets sick: there’s an incubation period of the disease, then you have to meet them, and then there’s, you know, a process by which you spend some time with them. So after some delay, you might become infectious yourself. Now that delay is random, in the sense that it’s unpredictable. So we model it as a random variable, big X, and it has a density function, little f. Infected nodes can recover, and we model this by allowing big X to be infinity. So sometimes the delay between one person getting infected and me getting infected is infinite, because they don’t manage to achieve infectious contact with me. So if the total integral of this density is less than one, then that’s saying that some of these contacts are simply never going to be infectious, because that person recovers before they infect me. So maybe another infection takes place, maybe another recovery.
And now the disease can’t progress anymore, because even though there are still active infections in the network, there isn’t a clear path to a susceptible node, and you can’t reinfect somebody. So this isn’t Omicron, that’s coming back again; this is a kind of emerging infection that confers immunity after you’ve had it. So the only thing left to happen is for the disease to die out. And then this is the final state: one person didn’t get it and all of these other three did. So how do we quantify the kind of risk, the dangerousness, of this disease? In the epidemiology literature, we use this concept called transmissibility. This is basically the probability for an infectious node to pass the disease to a neighbor. So I have an infectious contact, they have the disease: what’s the chance that I catch it at all, at any point? That’s just this quantity big T, which is the total integral of the delay distribution. And if that’s less than one, that means that there’s a chance that the delay is infinite; that’s the chance we never spread at all. So big T is some number in the range zero to one. And this quantity is super important, because if it’s small, then the disease is going to die out really quickly. If there’s a low chance of it spreading, then just a few people are infected, there’s a minor outbreak, and then it dies out. If it’s large, then it can spread through the whole network. So here’s a little example. Here’s some sort of social network, let’s say, and this is the initial infected individual, and we’ve got a high transmissibility. So as this disease progresses, it’s going to spread quite rapidly through the network, leaving behind this kind of core of people who have had the infection, and only a few are going to escape with the high transmissibility. Now, this is sort of one random instance. You can think of repeating this as a simulation model several times.
Or you can think of it as saying, you know, what’s the probability of each of these individual nodes getting infected at some point.
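As a small worked example of the transmissibility described above, the sketch below integrates a delay density numerically. It assumes, purely for illustration, that infectious contacts happen at a constant rate `beta` and recovery at a constant rate `gamma`; under that assumption the density of a finite delay is `beta * exp(-(beta + gamma) * x)` and its total integral is `beta / (beta + gamma)`, which is less than one. The rates and function names are hypothetical, not values from the talk.

```python
import math


def transmissibility(delay_density, upper=50.0, steps=100_000):
    """Approximate T, the integral of the delay density on [0, upper] (trapezoid rule)."""
    h = upper / steps
    total = 0.5 * (delay_density(0.0) + delay_density(upper))
    for n in range(1, steps):
        total += delay_density(n * h)
    return total * h


beta, gamma = 0.6, 0.4  # hypothetical contact and recovery rates


def density(x):
    # Assumed density of a finite delay X: contact at time x, before recovery.
    return beta * math.exp(-(beta + gamma) * x)


T = transmissibility(density)  # close to beta / (beta + gamma)
```

The missing mass, one minus `T`, is the probability that the delay is infinite, meaning the contact never transmits.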

If we now try again with a lower transmissibility, then there are going to be fewer successful transmissions, and probably the disease is going to die out quite quickly, without managing to infect the bulk of the network. And there’s going to be a regime shift between these two; we call it a phase transition. So for low transmissibility, you get tiny little outbreaks, and then suddenly there’s going to be a point where large, macroscale outbreaks become possible. So these are the kinds of questions I have about this process. What’s the critical value, or threshold, for transmissibility in a given network? So can we kind of classify a network as to how infectious a disease has to be in order to spread on it? Which nodes are most likely to be infected? How fast does the epidemic spread, and which nodes will be infected first? So I’d really like to dig into this structure and say, not just a network-level property of how likely an outbreak is, but I want to say: okay, these five nodes are the key ones, they are the most at risk, they’re likely to get it first, they’re the ones we should watch or protect, or invest some resources in somehow. And I’m going to use some applied maths to try and get at this. The method is called message passing. And message passing is a ...

**Sheri Markose**

Am I missing something in your transmissibility function? Shouldn’t you have had a weighted adjacency matrix and a summation sign? I mean, f of x would not do it, would it?

**Tim Rogers**

So an equivalent alternative would be to write down weights on the edges, and then say that only those edges whose weight is above some threshold are infectious. That would be absolutely equivalent, and that’s called a percolation process. Although mathematically they’re the same, there’s a slight difference in how you think about it. The weighted-edge version you think of as this kind of one event, which is writing all the weights, then deleting the edges, and then you look and see what’s left. Whereas in this kind of spreading picture, I’m thinking that each time I encounter an edge, I decide how long I wait before...
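The percolation picture mentioned here can be sketched as follows: keep each edge independently with probability T, then read off the cluster connected to the seed as the final outbreak. This is a minimal illustration assuming a single global T; the function name, the edge list format and the use of a seeded random generator are assumptions made for the example.

```python
import random


def outbreak_by_percolation(edges, seed_node, T, rng):
    """Keep each edge with probability T, then return the seed's connected cluster."""
    kept = {}
    for u, v in edges:
        if rng.random() < T:  # this contact would eventually transmit
            kept.setdefault(u, []).append(v)
            kept.setdefault(v, []).append(u)
    infected = {seed_node}
    frontier = [seed_node]
    while frontier:
        node = frontier.pop()
        for neighbour in kept.get(node, []):
            if neighbour not in infected:
                infected.add(neighbour)
                frontier.append(neighbour)
    return infected


# Hypothetical four-node ring used only to exercise the function.
ring_edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
sample_outbreak = outbreak_by_percolation(ring_edges, 0, 0.5, random.Random(1))
```

Repeating the call with different random seeds gives an estimate of how often each node ends up in the outbreak, without tracking any timing.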

**Sheri Markose**

So that is common, global? The transmissibility rate is global to everybody, is it?

**Tim Rogers**

Yeah, T is a fixed parameter that’s the same for everybody in this model. You could make it different for different people. There are various kinds of fine-tuning and more realistic things you could add, one of which would be this, but for the sake of this work I’m going to keep it fixed for everybody.

**Sheri Markose**

Okay.

**Tim Rogers**

Okay. So I’m conscious already of time. There is going to be a wash of equations coming, but I’ve put the links for the papers, so I’m going to blast through the maths quite quickly, and if anybody wants to look at the detail, you know where to find it. So here’s the idea: we’re going to define a system of random variables that tells us the information we want from this model. So I’m going to have nodes i and nodes j, and t_i is going to be the infection time of node i. And I’m going to say t_i equals infinity in the case that node i is never infected. So if it’s a finite number, it’s when it got infected; if it’s infinity, that means it never got infected. And then I’m going to use this notation t with a subscript i from j, t_{i←j}: that’s the time that node j has infectious contact with node i. And again, I’m going to use infinity in the case that no such infectious contact takes place. So that subscript is telling you the direction of the transmission of the disease. Okay? So these are random times. And then one more thing: big X is going to be the delay time. So it’s the delay between node j being infected and successfully passing it to a neighbor i. Okay, so when I receive the disease, for each of my neighbors, I wait a random amount of time before I transmit to them. And for some of them, I wait infinitely long, and then they never receive it. So these are my random variables. I now need to figure out the relationship between them that’s described by the network. And it works as follows. So t_i, this is the time that node i gets infected, and that’s just the minimum of the times that its neighbors had infectious contact with it. So the time that I’m infected is the first time somebody infectious has infectious contact with me. So this is kind of obvious; the next one is a little less obvious.
So what is the first time that node j has infectious contact with node i? Well, first of all, node j has to receive the infection from somewhere else in the network. So I look at the minimum, over the other nodes k who are neighbors of j but that are not i, of when they had infectious contact with j. That’s j receiving the infection, and then it has to pass it on, so then I add on this random delay, X. So here, I’m looking at the event that k gets sick, passes it to j, and then j passes it to i. Okay, so this now is a system of equations that relates a bunch of random variables on the network. As these are written, these are relationships between random variables. I’d now like to translate them into relationships between the density functions for those random variables, and then I can start doing analysis. So let’s say capital F_i(t) is going to be the probability of node i being infected before time t. And similarly, F_{i←j}(t) is the probability that the transmission event happened before t. Now, I’m going to make a crucial assumption of independence, which is not technically true, but it turns out to be good enough that we get really, really good results on most networks. So it’s going to work for us for now. So we can work out, you know, what’s the relationship described by this. So this equation was about random variables, and then this is the corresponding equation for the density of those random variables. And then the same thing happens when we look at the transmission events. It’s a bit more complicated, because we’re not just taking a minimum, we have to add on something, so we end up with a convolution integral over the delay period. So this is the full equation, and this is called the message passing equation. It describes the relationship between when do I think j had infectious contact with k, compared with when do I think i had an infectious contact with them.
So this now is the system of equations that relates functions that live on the edges of the graph, right? So I’ve got a large network, so I’ve got lots of edges. And on those edges I’ve got these functions, and they relate to each other according to this equation, or system of equations. So ideally, I now need to try and solve the system.
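The slides are not reproduced in this transcript, so the block below is a hedged reconstruction of the relations described verbally above, not a copy of the speaker’s equations. It assumes that N(i) denotes the neighbors of node i, that f is the delay density, and that incoming messages are treated as independent; the special treatment of the initially infected node is left out.

```latex
% Random-variable relations (minimum over neighbors, plus a random delay X)
t_i = \min_{j \in N(i)} t_{i \leftarrow j},
\qquad
t_{i \leftarrow j} = \min_{k \in N(j) \setminus i} t_{j \leftarrow k} + X

% Corresponding relations for the distribution functions, assuming independence
F_i(t) = 1 - \prod_{j \in N(i)} \bigl[ 1 - F_{i \leftarrow j}(t) \bigr]

F_{i \leftarrow j}(t) = \int_0^t f(x)
  \Bigl( 1 - \prod_{k \in N(j) \setminus i}
  \bigl[ 1 - F_{j \leftarrow k}(t - x) \bigr] \Bigr) \, dx
```

The last line is where the convolution over the delay appears: the minimum becomes a product of survival probabilities, and adding X becomes an integral against f.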

So these, as I said, are called the message passing equations. In other literature, they’re sometimes called the cavity method, or belief propagation. Belief propagation comes from the idea that node j might have some information that it’s transmitting to node i, and this equation is about how you integrate the information you get from different sources and then pass it on. Alright, so how are we going to solve this? Firstly, let’s simplify things a little bit and just look at the case of: are you infected eventually? So it doesn’t matter when it happened, just whether it happened or not. For that case, I’m going to set little t to be infinity, and I’m only interested in computing something I’m going to call the riskiness of a node. So r_i, this is just F_i of infinity, meaning did I receive the infection before time infinity, so did I get it at all? And then this relates to the neighbors as follows. This equation, by the way, you can read again and then check that it works with your intuition. So this says: what’s the probability of node i being infected? Well, it’s one minus the probability that it’s not infected. And for it not to be infected, it needs to have failed to be infected by each of its neighbors. So the product means all of the neighbors have to fail, and then this is the probability of their failure. And then for these guys, we can do the same thing: we plug it into the more complicated equation. With the integral, now, because it goes to infinity, I’m not worried about the full shape of little f, I’m only interested in big T, and I end up with this algebraic system. So this tells me how the riskiness of an edge between two nodes relates to the riskiness of the other edges that are incident on it. So this is a nonlinear system of equations, but if I can solve it, then I can make predictions about where is at risk in this network. So there’s a general procedure that we can outline. You’ve got some disease.
So step number one, you compute the transmissibility by doing this integral. Step number two, you solve this system of equations. And step number three, you plug it in here to deduce the node risk. Now, I’ve said this is easy, but it’s not super easy, because this step here is, you know, a large nonlinear system, so numerical solution is pretty much the only option we’ve got. Luckily, you can just iterate. Now, in order to do this iteration, I’m going to have to introduce another network object, which turns out to already exist in the literature. It’s something called a non-backtracking graph, or a Hashimoto graph, because it showed up in number theory back in the early 90s, and then was kind of rediscovered in this network setting quite recently. So this is the graph that describes the relationships between the messages in this message passing process. Its nodes are ordered pairs of nodes in the underlying graph. And for its edges, I draw an edge from the pair (l, k) to the pair (j, i) if k is j and l is not i. Now usually, when something like this crops up, I try and draw a picture so I can understand it better, but actually that’s not very helpful in this case, as you’ll see. If I draw a super simple little graph like this, and then draw the Hashimoto graph on top, I end up with a horrible sort of spidery mess. But I can describe what’s happening here. So this black network is the Hashimoto graph of the relationships between these variables. Each edge in the underlying graph spawns two nodes in this new Hashimoto graph, one for each direction that you could walk down it. And then I join them up if it’s possible to walk down one edge and then the other one without turning around. So there’s a line from here to here, because it’s possible to traverse this pathway, and then this pathway, without going backwards. That’s all it is.
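The three-step procedure can be sketched as a simple fixed-point iteration. The code below assumes the algebraic system takes the form "edge risk from j to i equals T times one minus the product, over the other neighbors k of j, of one minus the edge risk from k to j", which matches the verbal description but is a reconstruction rather than the speaker’s own code. The function name, the starting guess of 0.5 and the complete graph on four nodes are illustrative assumptions.

```python
def node_risks(adjacency, T, iterations=200):
    """Iterate the edge-risk map, then combine edge risks into node risks."""
    # One message per directed pair (i, j): the risk that j infects i.
    edge_risk = {(i, j): 0.5 for i in adjacency for j in adjacency[i]}
    for _ in range(iterations):
        updated = {}
        for (i, j) in edge_risk:
            escape = 1.0
            for k in adjacency[j]:
                if k != i:  # non-backtracking: ignore the edge back to i
                    escape *= 1.0 - edge_risk[(j, k)]
            updated[(i, j)] = T * (1.0 - escape)
        edge_risk = updated
    risks = {}
    for i in adjacency:
        escape = 1.0
        for j in adjacency[i]:
            escape *= 1.0 - edge_risk[(i, j)]  # every neighbor must fail
        risks[i] = 1.0 - escape
    return risks


# Hypothetical example: complete graph on four nodes.
complete_graph = {0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2]}
risks_low = node_risks(complete_graph, 0.2)   # below threshold: risks vanish
risks_high = node_risks(complete_graph, 0.9)  # above threshold: risks near one
```

Plain iteration like this is only a sketch; on large networks one would normally stop once successive updates differ by less than a tolerance rather than run a fixed number of sweeps.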

**Sheri Markose**

So what’s the intuition of the non-backtracking? Wouldn’t other people call it acyclical, or are there not other ways in which we could understand it?

**Tim Rogers**

It’s not necessarily acyclical. It may have longer cycles, but it’s removing the very short cycles. The intuition for why this is an important object is quite simple: if I catch the disease from my wife, then she probably didn’t catch it from me. So the path that the disease has taken through the network is non-backtracking. And the same follows for information flow, and the same for credit contagion and a whole bunch of processes.
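A minimal sketch of the non-backtracking (Hashimoto) construction follows, using the rule stated in the talk: the directed pair (l, k) links to the pair (k, i) whenever i is not l. The function name and the triangle example are assumptions for illustration.

```python
def non_backtracking_edges(adjacency):
    """Map each directed edge (l, k) to the directed edges (k, i) with i != l."""
    successors = {}
    for l in adjacency:
        for k in adjacency[l]:
            # Walk l -> k, then continue to any neighbor of k except l itself.
            successors[(l, k)] = [(k, i) for i in adjacency[k] if i != l]
    return successors


# Hypothetical example: a triangle. Each undirected edge spawns two directed
# edges, and each directed edge has exactly one non-backtracking continuation.
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
hashimoto = non_backtracking_edges(triangle)
```

Writing this mapping as a square 0/1 matrix indexed by directed edges gives the non-backtracking matrix that the later stability analysis refers to.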

**Sheri Markose**

Very interesting. In finance, doesn’t this mean...

**Tim Rogers**

Exactly. So anytime you have this kind of immediate non-reciprocity, then it’s probably not the adjacency matrix you should be thinking of, it’s probably this Hashimoto matrix. Yeah, that’s the big takeaway from this whole talk. Yeah, thank you.

**Anindya Chakrabarti**

Is this a directed acyclic graph, or no, can there be possible cycles?

**Tim Rogers**

There will be cycles. You can probably spot one in here if you look around long enough. And...

**Anindya Chakrabarti**

but it has a causal interpretation. I think that yeah,

**Tim Rogers**

Exactly. Yeah. I think the other crucial element is that the cycles are long. So usually they traverse a large part of the network before they actually come back to you. And in fact, they turn out to be long enough that they’re not terribly important in the analysis, which is a very helpful feature. Okay. So, written in terms of this Hashimoto matrix, my nonlinear system has a slightly simpler form. I can think of a map: on each edge, for each direction, I have a guess of what I think the risk is, and then I can update it according to this map that integrates each of the risks coming in and then gives me a new number here. So solving the message passing equations is basically finding a fixed point. And that also then gives us a way of getting at things like stability, because we can examine the stability of fixed points, and so on, to decide on the different solution regimes. And as I’ve described, simple iteration still works. And here’s the key picture. So this is a typical kind of result you get from analyzing risk in a standard kind of randomly generated network. Here, I’ve got a network with something like 500 or 1000 nodes, and I’ve considered various different values for this transmission parameter T. And for each of those, I’ve calculated for each node what its risk is, and those are these little gray lines here. So every node has its own little profile that describes how its risk of getting the infection increases as the transmissibility gets higher. The things that stand out in this picture are, firstly, this threshold down here: there’s a special value below which risk is basically zero, and then suddenly it explodes. And then the other key thing is that these paths are not kind of neatly layered; they can cross over each other. So I picked out two that I then did some simulations on to sort of double check: the purple dashed one and the green solid one.
So this pair is a pair that reverse their ranking of risk. The green is more at risk than the purple when T is quite low, but then they switch over for T quite high. So there’s some kind of complicated process going on here that describes how risk is apportioned in these networks, and it’s different for large T and for low T. So I’m looking at the time, I’ve got a quarter of an hour, so I’m not going to show you how I did it. You’ll have to read the paper to find out how; I’m going to blast through these slides. But I’m going to answer these questions about the epidemic threshold and the rank order of risks. And it’s all to do with the stability of that system. So that’s the way: it’s a basic stability analysis. And here is the answer. You end up eventually with a linear equation that says: take your matrix B, which corresponds to the adjacency matrix of that Hashimoto graph I showed you, the non-backtracking graph. So write that non-backtracking graph as a matrix, look at that matrix, find its largest eigenvalue, and that is one over the critical transmission threshold. So that matrix contains crucial information about how transmissible a disease has to be in order to spread in this network. And moreover, the ranking of risk comes from the dominant eigenvector of that matrix. So the rest...

**Sheri Markose**

We’ll be back in very familiar territory then.

**Tim Rogers**

Right, exactly. So eventually it’s all eigenvalues and eigenvectors. So yeah, you compute this top eigenvector, and that tells you, for each one of these nodes, who is relatively most at risk, and the behavior is linear near that transition point.
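The threshold result, critical transmissibility equals one over the largest eigenvalue of the non-backtracking matrix B, can be illustrated with a small power iteration. This is a sketch under assumptions: it works directly on the adjacency lists instead of building B, it applies B plus the identity to avoid oscillation on periodic graphs, and it presumes the graph has cycles so that the eigenvalue is positive. The function name and example graph are hypothetical.

```python
def leading_eigenvalue(adjacency, iterations=500):
    """Estimate the largest eigenvalue of the non-backtracking matrix B."""
    edges = [(j, i) for j in adjacency for i in adjacency[j]]  # directed j -> i
    vector = {edge: 1.0 for edge in edges}
    eigenvalue = 0.0
    for _ in range(iterations):
        new = {}
        for (j, i) in edges:
            # Sum over directed edges k -> j that can continue to j -> i (k != i).
            incoming = sum(vector[(k, j)] for k in adjacency[j] if k != i)
            new[(j, i)] = incoming + vector[(j, i)]  # the identity shift
        norm = sum(new.values())
        eigenvalue = norm / sum(vector.values()) - 1.0  # undo the shift
        vector = {edge: value / norm for edge, value in new.items()}
    return eigenvalue


# Hypothetical example: complete graph on four nodes, where each directed edge
# has two non-backtracking continuations, so the leading eigenvalue is 2.
complete_graph = {0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2]}
critical_T = 1.0 / leading_eigenvalue(complete_graph)
```

For real networks a sparse eigenvalue solver would be the usual choice; the loop above is only meant to show what quantity is being computed.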

**Anindya Chakrabarti**

So just a quick question: is this then the convergence-of-Markov-chain result, basically, that it will converge to the dominant eigenvector?

**Tim Rogers**

There’s not necessarily a Markov chain underpinning this. It’s more that there’s a collection of functions that are related to each other in this nonlinear way. I can iterate an algorithm to find the solution, and it’s about the stability of that solution. Okay, and then up to the other end. At the other end, things are actually simpler. It’s all about how many neighbors you have. So Taylor expanding around certain transmission, you only get a non-zero coefficient in the expansion at the order of the number of neighbors you’ve got. So what that basically says is, every extra contact is a proportional extra chance to become infected in the highly infectious regime. So returning to this picture, I can answer those questions. This point here is one over the largest eigenvalue of the non-backtracking matrix. Each of the lines is leaving as a straight line from that point, and the slope is given by the corresponding entry of the eigenvector. And then at this end, the paths separate into bands, and your band is decided by what degree you have as a node in the network. Okay. So that’s answering all the static questions. I promised you stuff about speed. I’ve got about 10 minutes, I think, so I’m going to go nice and quickly through speed. That analysis was all about the static situation. However, we now have the questions of how fast it’s going to spread, and who’s going to get it first. So, at some point, I derived these equations relating the time of infection, and then I put t to infinity because I was only interested in the endpoint. But now I need to roll that back and think about general times. And now the exact shape of the distribution of delays is going to become important again. So this is my equation. I’m augmenting it with a superscript n, because I’d like to think about how far I am from the source in terms of steps in the network.
So big F_{i←j} with superscript n, at time t, is going to be the probability that node j has not transmitted the infection to i before time t, given that the initial infection was n steps away in the network. And I can relate that then to what happens when the initial infection is only n minus one steps away. So this is a bit like saying: okay, I’m personally worried about catching a disease, so what I’m going to do is survey all of my friends and family, and work out how close they are to their friends and family, and to their friends and family. I’m going to chase it all the way backwards until I find the source. And now I can think about the possible pathways that the disease might take. And this is one step of that chasing process: how do I go from distance n to n minus one, so closer to the source? When you calculate these things for standard or simple networks, this is the kind of picture you see. If you’re at distance one from the source, you’re just waiting to have it transmitted to you, and this blue curve describes how long you’re going to have to wait. If you’re at distance two, you have to wait for that first transmission event, and then you have to wait for it to be transmitted to you from one of your immediate neighbors. And you get a slightly different curve, and three, and four, and five, and six. And eventually this shape stops changing. It settles down to a constant sort of shape of a function that’s just shifted right by a small amount tau. So this tau is the typical delay between one person at distance 100, say, receiving the infection and then passing it on to somebody at distance 101. The typical delay between layers 100 and 101 is about the same as the typical delay between layers 1000 and 1001, or more. And the shape is basically the same; it settles down after a while. So what I’d like to do is calculate: what’s that shape? What’s this delay tau?

**Sheri Markose**

Sorry, what was on the x axis?

**Tim Rogers**

It’s time. So it’s just saying the probability of infection prior to a certain time. The longer you wait, the more likely the infection gets. But the further you are from the source, the more you’re going to have to wait for it to reach near you, and then for you to get it.

**Sheri Markose**

So time, I mean, so what is the point five, then

**Tim Rogers**

Arbitrary units. Actually, I can tell you the units: it’s units of the typical infectious period. So the blue curve here is the easiest one to think about. The mean of when it was transmitted, that’ll be one. So that would be how we measure time, basically. Okay. So I’m going to, again, approach it from a linear point of view. And the linearity is only present in one particular regime, which is when we think about very fast transmission. So what’s the probability that I’m infected very early on in the epidemic? Well, if I’m going to get the disease unusually early, I only need one of my neighbors to have got it unusually early. So there’s an additive, linear effect there: every extra neighbor that I have is another chance for me to get it unusually early. So thinking about these early times allows us to take this piece here and to linearize it when F is very small. So that product, that messy nonlinear thing, is going to become a sum, which is starting to get nicer. We still have to deal with this integral, and we’re going to do that with a Laplace transform. As any engineering student knows, if you’ve got a convolution integral like this and you Laplace transform it, you end up with something nice and multiplicative. And now we’re back in linear land, right? I’ve got another linear system involving the non-backtracking matrix. This sum here is the sum over the neighbors in the non-backtracking matrix. This little f, this is the Laplace transform of the delay distribution; this is how the characteristic of the disease feeds into the spreading time. And then this piece here is, again, this linearity coming from non-backtracking.
So what do we expect? We expect to see exponential growth, so we can plug that in as an ansatz and solve, and indeed we find a formula that tells us what our typical delay is. It’s not intuitive: it’s a maximum over a difference of logarithms. It’s a weird sort of thing, and you have to do quite a bit of work to get this; I’ve condensed it in this slide. But it turns out to work beautifully. So here are our predictions of different speeds in different networks. Each dot here represents a network that we’ve downloaded from the SNAP database, that we then ran a bunch of simulations of epidemics on. We’ve calculated the typical spreading speed, and we’ve compared that to our linear theory for this delay, and we find this nice correspondence. So we actually get a very accurate prediction for the speed of spread, or the typical delay between the layers as it spreads through the network. It also lets us...

**Sheri Markose**

Remember you said the speed is something to do with the distances in the network. So again, last year, what is tau?

**Tim Rogers**

Right, so tau is the delay. Tau is one over speed. So tau is the delay, and one over tau is the speed.
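As an editorial illustration of tau as a delay per network step: if a node sits d steps from the source, a first estimate of its infection time is d times tau. The sketch below assumes an unweighted, static network stored as an adjacency dictionary; the function name `expected_arrival_times` and the value of `tau` are hypothetical and not taken from the talk.

```python
from collections import deque


def expected_arrival_times(adjacency, source, tau):
    """Estimate infection times as (shortest path length to source) * tau.

    adjacency: dict mapping each node to an iterable of its neighbors.
    tau: assumed typical delay per network step (hypothetical value).
    """
    distance = {source: 0}
    queue = deque([source])
    # Breadth-first search gives shortest path lengths in an unweighted graph.
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return {node: d * tau for node, d in distance.items()}


# A path 0 - 1 - 2 - 3 with an illustrative delay of 0.5 time units per step.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(expected_arrival_times(path, source=0, tau=0.5))
```

Nodes that cannot be reached from the source are simply absent from the result.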

**Sheri Markose**

So somehow you’ve calculated these distances in the network. By that I mean, as I understand this network, first there are the people who are one removed from me, and so on. Yeah, so those are what you mean. So you have the path lengths, right?

**Tim Rogers**

Yeah, so what I’m saying is, if a node has a path of length 10 to the source, then the time at which I expect it to be infected is 10 tau. Tau is how long I expect the disease to take to spread over one unit of distance. Okay. This is showing you how tau can vary with the properties of the disease itself, so the time delay in spreading. And basically, the message here is that having an infectious period that’s tightly concentrated around the typical value leads to slower spreading than having an infectious period that can be very large or very small. It’s really the anomalously small events that determine the overall speed of spread. And I’ll blast through this. But another, final result, then, is about when: you know, who is going to get it first? And again, it turns out to be exactly the same result as for risk, only it’s a log-linear relationship. So I previously showed you how you can calculate, I guess, a kind of number that codifies the risk of a particular vertex. It turns out that if you take the log of that number, that will give you a very good prediction of when in the epidemic outbreak that node is going to get the disease. So these are a bunch of simulations from a bunch of different types of networks, always showing this log-linear relationship between this centrality quantity that we compute and the time to infection. And lastly, one final sort of interesting feature from this. So this relationship I showed you here, the log-linear relationship between risk and infection time: risk is something that we calculate as a purely local property. It doesn’t know how big the network is. It’s all to do with just an eigenvector, and it doesn’t have any clue about how large the overall network is. So what that means is that if you think about larger and larger networks, so your social network, going all the way up to the scale of the whole world. The calculation of risk is gonna change.
And therefore its logarithm is going to change, and therefore the expected time until you get sick is going to change. So what that means is that we expect a very explosive epidemic spread process: we expect a slow start, and then a very sudden, short period of time in which it spreads to almost all of the network, and then a slow sort of mop-up. But yeah, so this is kind of bad news, in the sense that it says that, you know, if you watch an outbreak on a timescale that lets you see the start and the finish, almost all the infections happen in a very tiny window of time right in the middle. There’s a real danger period right in the middle. Okay, I’m going to wrap up. This is the summary of what we figured out. So, first important question: what is the epidemic threshold for a particular network? Well, we find it’s the reciprocal of the top eigenvalue of the non-backtracking matrix. What are the risk profiles of the individual vertices? Well, that’s proportional to this non-backtracking centrality, which is the top eigenvector of this Hashimoto matrix. That’s for low transmission; for high transmission, it’s basically just degree. And then on speed of infection, we have this strange-looking formula that does a good job of predicting how fast an epidemic can spread through a particular network. And that marries, you know, properties of the disease, this is the delay time, with properties of the network itself, and how the two things combine to set this delay time. And then finally, when is my personal expected time to receive the infection? Well, it’s proportional to the log of my non-backtracking centrality.
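As an editorial illustration of the first two items in this summary, the sketch below builds the Hashimoto (non-backtracking) matrix of a small undirected graph, takes the reciprocal of its top eigenvalue as the threshold, and sums the top eigenvector over incoming edges to get a per-node centrality. The function names, the normalization, and the incoming-edge convention are assumptions (one common convention in the literature), not necessarily what the talk used, and the example graph is made up.

```python
import numpy as np


def non_backtracking_matrix(edges):
    """Hashimoto matrix indexed by directed edges of an undirected graph.

    Entry [k->l, a->b] is 1 when b == k and a != l, i.e. edge a->b feeds
    into edge k->l without immediately stepping back to where it came from.
    """
    directed = list(edges) + [(v, u) for u, v in edges]
    size = len(directed)
    B = np.zeros((size, size))
    for i, (k, l) in enumerate(directed):
        for j, (a, b) in enumerate(directed):
            if b == k and a != l:
                B[i, j] = 1.0
    return B, directed


def threshold_and_centrality(edges):
    """Return (1 / top eigenvalue, node centralities normalized to sum to 1)."""
    B, directed = non_backtracking_matrix(edges)
    values, vectors = np.linalg.eig(B)
    top = np.argmax(values.real)          # leading eigenvalue is real
    leading = vectors[:, top].real
    if leading.sum() < 0:                 # fix the arbitrary overall sign
        leading = -leading
    centrality = {node: 0.0 for edge in edges for node in edge}
    for (a, b), weight in zip(directed, leading):
        centrality[b] += weight           # sum over edges pointing into b
    total = sum(centrality.values())
    return 1.0 / values[top].real, {n: c / total for n, c in centrality.items()}


# Made-up example: a 4-clique (nodes 0-3) with one extra node 4 hanging off node 0.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (0, 4)]
threshold, centrality = threshold_and_centrality(edges)
print("threshold:", threshold)
for node in sorted(centrality):
    print(node, centrality[node], np.log(centrality[node]))
```

The graph needs at least one cycle for this to be meaningful, since on a tree the non-backtracking matrix has no positive eigenvalue. The last column prints the log of the centrality, the quantity the talk relates linearly to expected infection time; the slope and intercept of that relationship are not given in the transcript, so none are assumed here.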

So just to very finally finish, I want to issue a sort of warning for all of this research. I’m not 100% convinced that it’s something that we want. And this is a news story from early on in COVID, where South Korea, if I remember correctly, had a very stringent track and trace process that was also quite public. I think they were releasing a lot of information about where people had been, and so on and so forth. They were really putting effort into putting together these networks, and then using them to predict who ought to be tested, who ought to isolate, etc. And this made a lot of news around the world, not because it was a marvelous way of stopping the disease, but because it raised questions about people’s privacy, and liberty, and so on and so forth. And so I want to sort of caution that all this personalized, say, you know, it’s very interesting from a mathematical point of view. But are personalized predictions of this sort of thing really the direction we want to take? I’m not sure. So that’s it. Thank you. I’m very happy to take questions, although I’ll be eating into the next speaker’s time. So maybe we should keep it very brief.

**Sheri Markose**

Just very quickly, that is very fascinating. Maybe the thing that I’ve learned is about your Hashimoto matrix. If you weren’t using one of those matrices, how would it be different? Because all the results look very familiar to me in one slide, that one? Yeah.

**Tim Rogers**

So I’ve mentioned a couple of times this thing, non-backtracking centrality, which is the name in the literature for this top eigenvector. That was invented first as nothing to do with epidemics; it was invented as just another nice metric you can calculate. When it first came into network science, it was supposed to fix a problem. And the problem is that most of the first attempts at metrics that you decide to put on a network get very heavily concentrated on high-degree vertices, and the neighbors of high-degree vertices, and you get real sort of useless results sometimes, yeah, with these high-degree vertices. This was designed to counteract that. And that’s the kind of key difference. So firstly, what I’ve shown you here is a principled derivation that is correct in a certain limit, right? Secondly, if you compared it to another measure, just like a pure eigenvector centrality or something, you would get the wrong answer. And the reason you would get it wrong is because it would be putting too much weight on high-degree vertices and cliques and things like that, essentially because it wouldn’t be ruling out transmission, you know, back and forth among neighbors many times.

**Sheri Markose**

It’s very interesting. Yeah. So that’s very valuable. There are so many of us who work on financial networks, and I really think we should look into this.

**Tim Rogers**

Right, exactly. Yeah.

**Sheri Markose**

That’s brilliant, Tim. That’s my questions done.

**Tim Rogers**

Do we have time for one more?

**Ben Etheridge**

Yeah, a question from Nikiforos.

**Nikiforos Panourgias**

Thank you very much, Tim. It’s absolutely very, very interesting. I’m not a mathematician, so I apologize if I’ve missed something along the line. But I was wondering, you know, about cases where people are not so structurally well defined within the network. I’ll just give you an example. At the beginning of the epidemic, I was working in Leicester but living in London and traveling between the two, and, say, Leicester was a sort of hotspot at the time. So I’m just wondering whether you would model that as something within the parameters, when you were particularly looking at the edges. Because, you know, I’m obviously, two times a week, going backwards and forwards; I go by train, and I go to my family, where my wife is actually, you know, just living in the house and not going anywhere. But I’m just wondering whether you would see it as two separate networks, with some people basically acting as links between them in the network model that you’ve set up, or whether it’s built into the actual parameters. So I apologize if I missed that. And just a second one was on the belief propagation, thinking, yeah, that would probably work with things like, you know, the spread of ideas, fake news, sentiment. Also, even, I don’t know, perceptions of an innovation, you know, things like that. So just an interesting kind of aside there.

**Tim Rogers**

So on your second question, the answer is yes, absolutely. And there’s lots of research in that direction. On the first question, that’s a very valid point. I think the short answer is to say that in the work I’ve presented, the network is considered as being static, or at least changing very slowly with respect to the timescale of the disease. So maybe it’s a better model of information rather than necessarily an epidemic. And what you’re raising about how contact networks change in time is super-duper important, and is a really interesting challenge. For me, all I’ll say is I’ve looked into it. And really interestingly, all of this stuff breaks. And it should work perfectly. And it’s really interesting that it breaks. And the reason that it breaks is because of the process I described about iterating while you look further and further away in the network: when you think about a network that changes in time, you’re also looking further back in time. And eventually you reach the beginning of time. And what was there? There was no network there. And then the propagation process fails. And so, it’s quite interesting, it turns out that it’s gonna need some fundamentally new ideas. I think, actually, that this mathematics as it stands won’t work on its own. We need some new theory there. And I don’t know what that new theory is, but I have devoted a lot of time to it, right. So hopefully I can come back in another seven years. It’s heavy.

**Nikiforos Panourgias**

It’s really, really interesting, then I think it’s a super super area. Yeah, absolutely. I’m very fascinated.

**Tim Rogers**

Cool. Thank you.