Nodes, Neighbours and Footfall

A new simulation model looks at how social networks influence attendance in festivals and movies

The 45-day long Maha Kumbh Mela this year has been singular. Indian officials estimate that 450 million people — comparable to the combined population of the United States and Mexico — visited it within a month of commencement on January 13. Inter alia, the Maha Kumbh this year also marked the largest single-day peaceful human gathering in one place in recorded history. On January 29, 76 million believers congregated near the confluence of the Ganga, Yamuna, and (the-now-lost-to-time) Saraswati rivers in Prayagraj, a city in Uttar Pradesh.

The mind boggles at the scale of the event, and the accompanying logistical challenges: prevention of stampedes and crime, provision of medical treatment, and management of transportation nodes such as railway and bus stations. As such, it provided considerable grist to the behavioral scientist’s and policy analyst’s mills, as millions flocked to the Triveni Sangam in search of spiritual succor, religious camaraderie and, perhaps, spontaneous excitement.

If the Maha Kumbh represents one facet of contemporary India, Bollywood superstar Shah Rukh Khan’s 2023 thriller “Pathaan” represents another. The tale of an Indian Muslim intelligence operative on a global hunt, the film made 130 million dollars worldwide — a tidy hundred million more than its budget — running for more than 50 days straight in theatres. Khan would go on to have another stunning hit in 2023, with a movie about a vigilante fighting endemic corruption. At a time when commentators were all but ready to write Bollywood off — anointing streaming platforms as the giant slayers — King Khan disrupted, to use a word so favored by Silicon Valley.

To be sure, the fact that a Muslim film actor continues to draw cash and crowds at a massive scale runs contrary to persistent Western apprehensions about India’s majoritarian political and cultural turn over the past decade. And if Khan seems like a one-off phenomenon, consider this. The British band Coldplay’s show on January 27 this year in Ahmedabad, Gujarat (in the Narendra Modi Stadium!) handily became the largest ticketed concert in Indian history.

To paraphrase Mark Twain, reports of the demise of India’s pluralism have been greatly exaggerated. However, a number of basic questions must first be answered in order to fully understand the import of the examples above.

Just what are crowds — religious and secular, violent and peaceful, segregated or otherwise? How do they form and when?  What role does imitative behaviour play in their formation? What about relations of power and influence, evolving beliefs and expectations, and constraints on behaviour through cost and capacity? Why do crowd sizes vary in similar events, even when all other factors apparently remain the same? How does crowd behaviour change over the course of a single event? And how does footfall vary over time, say in movie theaters and street festivals?

These are old questions. However, since the early 1990s a set of interlinked ideas has done much to answer them. They range from economic models of herding and networked markets to methods from statistical physics and dynamical systems, demonstrating the richness of a transdisciplinary approach to simple-to-state but extremely deep problems. The power of such an approach became clear yet again in a paper published earlier in the month by a group of European scientists, which showed the emergence of “handedness” in the wave-like behaviour of crowds, drawing on high-frequency data from Spain’s San Fermín festival.



The Model

Motivated by the Maha Kumbh Mela, we have developed a simulation model of how footfall varies at a multi-time-period event, depending on event capacity, cost of attendance, and susceptibility to influence of (on- and offline) social networks. To be sure, it is not a model of crowd evolution at the Maha Kumbh, which would have to take many other variables into account. However, the generality of our simple model makes it applicable to many other situations, including attendance at movies which run for an extended period of time.

Population structure

We have a population of $N$ individuals forming a small world network of the kind introduced by applied mathematicians Duncan Watts and Stephen Strogatz in 1998. The edges connecting the individuals/nodes denote family and work ties, social media and offline friendships, and community leader-follower relationships. All relationships are given equal weight in the model.

Figure: A Watts-Strogatz network with $N = 45$, $k = 5$, and $p = 0.4$. Node A has four neighbours: orange marks the neighbour of A with the most connections (highest degree), while green marks the other three neighbours.

What Are Small World Networks?

The name comes from the small world phenomenon, first studied by sociologists in the mid-twentieth century and made fashionable through popular culture references to “six degrees of separation”. Watts-Strogatz networks are a category of small world networks. Roughly speaking, they lie in between regular and random networks, and are created from the former (where each node has $k$ neighbours) by rewiring each edge with probability $p$, where $0 \leq p \leq 1$. The case $p = 0$ corresponds to a regular network, while $p = 1$ makes the network completely random.
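The construction described above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the implementation used for the simulations: the function name `watts_strogatz`, the adjacency-dict representation, and the handling of odd $k$ (each node starts with `k // 2` neighbours on each side of the ring, so $k = 5$ behaves like $k = 4$) are all assumptions made here.

```python
import random

def watts_strogatz(n, k, p, seed=None):
    """Return {node: set(neighbours)} for a Watts-Strogatz style network.

    Assumption: every node starts joined to its k // 2 nearest neighbours
    on each side of a ring, so an odd k is effectively rounded down.
    """
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    half = k // 2

    # Step 1: build the regular ring lattice (the p = 0 case).
    for i in range(n):
        for j in range(1, half + 1):
            adj[i].add((i + j) % n)
            adj[(i + j) % n].add(i)

    # Step 2: with probability p, replace the edge (i, i + j) by an edge
    # from i to a uniformly chosen node, avoiding self-loops and duplicates.
    for j in range(1, half + 1):
        for i in range(n):
            old = (i + j) % n
            if old not in adj[i] or rng.random() >= p:
                continue
            if len(adj[i]) >= n - 1:
                continue  # node i is already joined to everyone else
            new = rng.randrange(n)
            while new == i or new in adj[i]:
                new = rng.randrange(n)
            adj[i].discard(old)
            adj[old].discard(i)
            adj[i].add(new)
            adj[new].add(i)
    return adj

# Same parameter values as the figure above.
network = watts_strogatz(45, 5, 0.4, seed=1)
degrees = sorted(len(nbrs) for nbrs in network.values())
print("smallest and largest degree:", degrees[0], degrees[-1])
```

Rewiring keeps the number of edges fixed but lets individual degrees drift apart, which is what makes a "most connected neighbour" meaningful later in the model.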

Cost, capacity and susceptibility

Each individual in the population has money randomly allocated to them within a range which has a maximum ($m_\textrm{max}$) and a minimum ($m_\textrm{min}$). It may be that any two individuals in the population have the same amount of money. We shall assume that the money allocated to individuals is uniformly distributed. Therefore, the inequality in the population is, in a manner of speaking, structureless.

The individuals in the population want to attend an event

  • which costs $P$ rupees per attendee, and has
  • capacity to accommodate a maximum of $C$ people at any time period, with $C$ remaining constant over time.

The event capacity $C$ has multiple interpretations. The simplest, of course, is that it is the maximum number of people a theatre or festival ground can accommodate (in the case of festivals, as prescribed by fire codes, for example). But it could also be interpreted as the maximum capacity of the modes of transportation, such as railways or buses, that allow individuals to reach an event, in case no other capacity metric for the event is readily available. Finally, we can broadly interpret $C$ in ecological terms, as the carrying capacity of an environment, that is, in terms of the spatial extent of an event supported by suitable logistical facilities such as availability of transportation to attend said event, and provision of food and medical resources for attendees. This interpretation, as the maximum load an environment (suitably defined) can take to support a population, is particularly apt for an event like the Maha Kumbh Mela.

We shall set the cost/price of attendance per attendee $P$ as the mean money allocated to the population. This is a reasonable — if somewhat simplified — assumption. After all, events are rarely held which cost more money to attend than what is available to at least some fraction of the population.
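To make the money allocation and the choice of $P$ concrete, here is a small sketch using only the Python standard library. The helper name `allocate_money` and the seed are illustrative assumptions; the draw is uniform on the half-open interval $[m_\textrm{min}, m_\textrm{max})$.

```python
import random

def allocate_money(n, m_min, m_max, seed=None):
    """Draw n amounts uniformly from the half-open interval [m_min, m_max)."""
    rng = random.Random(seed)
    # rng.random() lies in [0, 1), so each amount lies in [m_min, m_max).
    return [m_min + (m_max - m_min) * rng.random() for _ in range(n)]

money = allocate_money(10_000, 0, 1000, seed=42)

# The price of attendance is set to the mean money in the population.
price = sum(money) / len(money)

# Share of the population whose money is at least P.
share = sum(m >= price for m in money) / len(money)
print(f"P = {price:.2f}, share able to afford = {share:.3f}")
```

Because the distribution is symmetric about its mean, setting $P$ to the mean leaves roughly half of the population able to afford the event.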

We denote the insusceptibility of any individual in the population to influence as $s$, where $0 \leq s \leq 1$. The parameter can be interpreted as follows: A population with $s \approx 0$ consists of individuals who are highly susceptible to being influenced; one with $s \approx 1$ would be very difficult to persuade. 

Decision rules

Any individual in the population will attend the event at time $t$:

  • Affordability: if they have the money to do so, that is, the money allocated/available to them (their spending ability) is greater than or equal to $P$, and
  • Neighbour attendance rule: if the fraction of their neighbours that had attended the event at time $t-1$ is greater than or equal to $s$, and
  • Leader-follower rule: if any of their neighbours with the highest number of connections had attended the event at time $t-1$. (Do note that it is perfectly possible that two neighbours may have the same highest degree. Here the “any” in the condition becomes significant, with “any” replaced by “all” giving potentially different results.)

The Neighbour Attendance Rule and Insusceptibility in Action

The fraction in the neighbour attendance rule simply means the number of neighbours of the individual that attended the event divided by the number of neighbours of the individual. To see how this rule works, let us say an individual has 5 neighbours and 1 of them attended the event at $t-1$. If $s = 0.2$, the individual would have satisfied this rule, since $\frac{1}{5} = 0.2 \geq 0.2$. But if $s = 0.4$, that would not be the case since $\frac{1}{5} \ngeq 0.4$. This illustrates the meaning of the insusceptibility parameter $s$. An $s = 0.2$ corresponds to a more susceptible population than one with $s = 0.4$ since for the former value of $s$, 1 neighbour-attendee would have cut the mustard, which is not the case with the latter.
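The two network-based rules can be written as small predicates. The sketch below is an illustration under stated assumptions: the function names, the adjacency-dict representation, the `require_all` switch (for the “any” versus “all” variant of the leader-follower rule), and the choice that a node with no neighbours never satisfies either rule are not taken from the original program.

```python
def neighbour_rule(node, adj, attended_prev, s):
    """True if the fraction of node's neighbours who attended at t-1 is >= s."""
    nbrs = adj[node]
    if not nbrs:
        return False  # assumption: an isolated node is never persuaded
    frac = sum(1 for v in nbrs if v in attended_prev) / len(nbrs)
    return frac >= s

def leader_rule(node, adj, attended_prev, require_all=False):
    """True if any (or all) of node's highest-degree neighbours attended at t-1."""
    nbrs = adj[node]
    if not nbrs:
        return False  # assumption: an isolated node has no leader to follow
    top = max(len(adj[v]) for v in nbrs)
    leaders = [v for v in nbrs if len(adj[v]) == top]
    check = all if require_all else any
    return check(v in attended_prev for v in leaders)

# Node 0 has five neighbours; nodes 1 and 2 tie for the highest degree.
adj = {0: {1, 2, 3, 4, 5}, 1: {0, 2}, 2: {0, 1}, 3: {0}, 4: {0}, 5: {0}}
attended_prev = {1}  # only node 1 attended at t-1

print(neighbour_rule(0, adj, attended_prev, s=0.2))          # True:  1/5 >= 0.2
print(neighbour_rule(0, adj, attended_prev, s=0.4))          # False: 1/5 <  0.4
print(leader_rule(0, adj, attended_prev))                    # True:  one leader attended
print(leader_rule(0, adj, attended_prev, require_all=True))  # False: node 2 did not
```

The last two lines show why the choice between “any” and “all” matters once two neighbours share the highest degree.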

Relevant Literature

Note that the neighbour attendance and leader-follower rules are examples of sequential behaviour where a node takes a certain action at $t$ based on actions of others at $t-1$. This is typical of many models of herding, including the first such game-theoretic one introduced by the economist Abhijit Banerjee. Sociologist Mark Granovetter was the first to mathematically explore the notion of thresholds in collective behaviour, in terms of how many others must first act for an individual to act after in a given situation. A condition like the neighbour attendance rule was first introduced by Watts.

We impose three further conditions:

  • Non-repetition: An individual will attend the event at $t$ only if they have not attended it at any earlier time; once they have attended the event, they will not do so again.
  • Initial condition: We assume that at $t = 0$, a fraction $f$ of the population attends the event. This fraction of individuals will be chosen at random. Keep in mind that the initial attendees at the event are also not allowed to attend the event at any other time.
  • Capacity constraint: If the number of individuals who can potentially attend the event at $t$ (that is, those who satisfy the decision rules above) exceeds $C$, then $C$ of them will be chosen at random. This also applies to the initial attendees.

We implemented the model above in Python to see how the number of attendees at the event changes between $0 \leq t \leq T$, where $T$ is the end of the simulation.
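For readers who want to experiment, the rules and conditions above can be assembled into a short simulation loop. What follows is a minimal sketch and not the program used for the results reported here. The function `simulate`, its arguments, and the toy ring network are hypothetical, and it assumes that initial attendees are drawn from the whole population without an affordability check and that nodes without neighbours never satisfy the network rules.

```python
import random

def simulate(adj, money, price, capacity, s, f, steps,
             use_neighbour=True, use_leader=True, seed=None):
    """Return the number of attendees at t = 0, 1, ..., steps."""
    rng = random.Random(seed)
    nodes = list(adj)
    degree = {v: len(adj[v]) for v in nodes}

    def cap(candidates):
        # Capacity constraint: keep at most `capacity`, chosen at random.
        if len(candidates) > capacity:
            return set(rng.sample(candidates, capacity))
        return set(candidates)

    # Initial condition (assumption: random seeds, no affordability check).
    prev = cap(rng.sample(nodes, round(f * len(nodes))))
    ever = set(prev)
    counts = [len(prev)]

    for _ in range(steps):
        candidates = []
        for v in nodes:
            # Non-repetition and affordability.
            if v in ever or money[v] < price:
                continue
            nbrs = adj[v]
            if (use_neighbour or use_leader) and not nbrs:
                continue
            if use_neighbour:
                frac = sum(1 for u in nbrs if u in prev) / len(nbrs)
                if frac < s:
                    continue
            if use_leader:
                top = max(degree[u] for u in nbrs)
                if not any(u in prev for u in nbrs if degree[u] == top):
                    continue
            candidates.append(v)
        prev = cap(candidates)
        ever |= prev
        counts.append(len(prev))
    return counts

# Toy example: a regular ring of 200 nodes with four neighbours each.
n = 200
adj = {i: {(i - 2) % n, (i - 1) % n, (i + 1) % n, (i + 2) % n} for i in range(n)}
rng = random.Random(0)
money = [1000 * rng.random() for _ in range(n)]
price = sum(money) / n

counts = simulate(adj, money, price, capacity=30, s=0.2, f=0.05, steps=50,
                  use_leader=False, seed=0)
print(sum(counts), "attendees in total")
```

Switching `use_neighbour` and `use_leader` on and off gives the three rule sets discussed in the simulations: affordability only, affordability with neighbour attendance, and all rules together.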

Simulations

In what follows we shall use the following values for the parameters, though the model can be implemented with other choices.

| Parameter | Interpretation | Value |
| --- | --- | --- |
| $N$ | Total population/number of nodes in the network | 10000 |
| $k$ | Number of neighbours of each node in the network with a regular ring shape before rewiring | 5 |
| $p$ | Rewiring probability ($p = 0$ implies regular while $p = 1$ is random) | 0.4 (so our network is “closer” to the regular case than the random one) |
| $C$ | Event capacity | 30 |
| $m_\textrm{min}$ | Minimum money to be allocated to any individual in the population | 0 |
| $m_\textrm{max}$ | Maximum money to be allocated to any individual in the population | 1000 |
| $s$ | Insusceptibility to influence (smaller values mean higher susceptibility) | 0.2 (so our population is highly susceptible) |
| $f$ | Initial fraction of potential attendees | 0.05 (= 5%, which is 500 for a population of 10000) |

Affordability

Before jumping into simulations with the complete set of decision rules, let us build some intuition about the model and consider the simpler case where an individual who has the money to attend the event would be able to do so, with the only constraint being the event capacity. We expect that the event will see full capacity for a certain length of time during which almost half of the population will attend the event, after which the number of total attendees will fall to zero. This is because roughly half the population — around 5000 people — has the money to attend the event given how we have set up the cost/price of attendance in the model. (A technical remark: Since we are taking the average of values for money drawn from the uniform distribution defined on a half-open interval $[m_\textrm{min}, m_\textrm{max})$, the average will be around 500 and not exactly so. In our simulations, it takes the value 497.43.) 

This line of reasoning checks out in the simulation. 

Figure: Simulation using affordability and event capacity. Total attendees over time are shown as a blue line; the red dashed line marks the event capacity.

We see that for $0 \leq t \leq 165$ the event runs at full capacity in every period, giving a cumulative attendance of $30 \times 1 + 30 \times 165 = 4980$ people over that interval; a further 6 individuals attended the event at $t = 166$. This brings the cumulative total to 4986, which is indeed roughly half of the total population of 10000.
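The same bookkeeping can be checked in a couple of lines. This sketch assumes that the 4986 attendees reported above form the whole pool of people who attend in the affordability-only case, counting those at $t = 0$, and that the event fills to capacity in every period until that pool is exhausted.

```python
capacity = 30
pool = 4986  # cumulative attendance reported for the affordability-only run

# Number of completely full periods, and the size of the final partial one.
full_periods, remainder = divmod(pool, capacity)
last_full_t = full_periods - 1  # periods are numbered from t = 0

print(full_periods, last_full_t, remainder)  # 166 165 6
```

That is, 166 full periods ($t = 0$ to $t = 165$) followed by 6 attendees at $t = 166$.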

Figure: Cumulative attendees with affordability and event capacity, plotted as cumulative attendees against time step. The curve rises sharply and plateaus at approximately 5000 attendees.

All decision rules

When the complete set of decision rules has to be satisfied for an individual to attend the event, along with the capacity constraint, we see that only a small number of those who could afford to attend the event did so. This is not surprising given how stringent the full set of decision rules is when applied iteratively, especially the condition that an individual will attend the event only if at least one of their most connected neighbours did so in the previous time period.

Figure: Simulation using all decision rules and event capacity. Total attendees over time decline rapidly after reaching the maximum capacity of 30.

We see that a cumulative total of only 62 individuals attended the event. This is about 1.2% of all those who could afford to attend it.

Figure: Cumulative attendees with all decision rules and event capacity. Cumulative attendance rises gradually from around 30 to just over 60 across several time steps.

Network theorists would not be surprised by this result. As Watts and Peter Dodds observed in a 2007 paper, cascading behaviour is rarely driven by what we call leaders and they termed “influentials.”

The Movie Footfall Problem

The simulation with the full set of decision rules, along with secondary analysis of box-office collection data, indicates that our model may be applicable to attendance at box office hits, a claim that is in principle testable directly from theatre capacity data. For movies that go on to do very well, theatres typically run at capacity on the opening day and/or for a certain length of time afterwards; attendance starts dropping after the initial phase, and eventually comes down to zero (say over a period of a couple of months). Do note that we have to suitably redefine $C$ for any direct empirical test of the model with theatre capacity data, given that we will be aggregating capacity data from multiple theatres.

Affordability and neighbour attendance

The most interesting simulation comes from an “intermediate” set of decision rules, where an individual attends the event if they satisfy the affordability and neighbour attendance rules, ignoring the leader-follower rule.

Figure: Simulation using affordability and neighbour attendance rule and event capacity. Total attendees fluctuate over time; the red dashed line marks the event capacity.

A cumulative total of 1306 individuals attended the event. This is about 26.2% of all those who could afford to attend it.

Figure: Cumulative attendees with affordability and neighbour attendance rule and event capacity, plotted against time step on a scale from 0 to 1200 attendees.

Notice the rapid, sustained drops and climb-backs in the plot of total attendees. They indicate strong positive feedback effects. Such effects, arising out of imitative behaviour within a population, have also been observed in finance, in the work of econophysicist Didier Sornette and collaborators. Based on a set of interlinked models, Sornette has attributed bubbles and crashes in equity markets to imitation of noise traders, building another computational bridge between finance and behavioural economics.

The conceptual link between our model and those of financial markets of the kind Sornette and others advocate becomes worthy of further investigation when we plot the day-on-day change in attendance.

Figure: Day-on-day change in attendance with affordability and neighbour attendance rule and event capacity, shown over 100 time steps.
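The day-on-day change is simply the first difference of the attendance series. A minimal sketch follows; the function name is an assumption, and the numbers in `series` are made up purely for illustration and are not simulation output.

```python
def day_on_day_change(attendance):
    """Return attendance[t] - attendance[t-1] for t = 1, 2, ..."""
    return [b - a for a, b in zip(attendance, attendance[1:])]

# Made-up attendance counts, for illustration only (not simulation output).
series = [30, 30, 12, 0, 7, 25, 30, 9]
changes = day_on_day_change(series)
print(changes)  # [0, -18, -12, 7, 18, 5, -21]
```

Runs of large changes following one another in such a series are what an analogy with volatility clustering would look for.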

Whether day-on-day attendance change can be understood as an analogue of volatility clustering in finance (see page 4 of this paper for a concrete example in case of a single stock) is an open question. What is exceedingly interesting is that such a phenomenon is also present in network models of insurgency violence under development by us.

Concluding Thoughts

The model presented above and the simulations based on it show how social networks influence variations in crowd size under considerably general conditions. There are two overarching conclusions to be drawn from them, one scientific-philosophical and the other more practical.

Our modelling shows that if there are too many constraints on sequential behaviour for an individual (that is, if all three decision rules — affordability, neighbour attendance, and leader-follower rules — have to be satisfied), the resulting dynamics is not very interesting, just rapid decay in attendance. At the other end, if an individual is bound by too few constraints (given by the case when they simply have to satisfy the affordability rule), the behaviour is equally dull.

It is in the intermediate case, where an individual attends the event if they can afford to and if the fraction of their neighbours who attended in the previous time period meets or exceeds the insusceptibility threshold $s$, that we obtain the richest collective behaviour. This fits perfectly with a folk theorem of sorts in computational modelling of social life, which states that for complex behaviour to arise, the constraints on the system must be just right: not too few, and not too many either.

At a practical level, our modelling suggests that for sustained population participation in an event, organizers must mobilize a critical fraction of communities at an initial stage (the affordability and neighbour attendance rules) and not rely too much on convincing community leaders (the complete set of decision rules). This has implications for the design of social media and grassroots outreach campaigns.


Disclosure
The algorithm underlying the model presented above, along with its conceptualisation and associated mathematical analysis, is original, and the work of Tarqeq Research LLP. In order to facilitate rapid development, a commercially-available generative AI was used to implement it as a computer program. However, the program has been reviewed, tested, refined and debugged as needed manually.
