Big Data, Agents and the City

In the recently published book “Big Data for Regional Science”, edited by Laurie Schintler and Zhenhua Chen, Nick Malleson, Sarah Wise, Alison Heppenstall and I have a chapter entitled “Big Data, Agents and the City”. In the chapter we discuss how big data can be used to build more powerful agent-based models: specifically, how data from, say, social media could inform agents’ behaviors and their dynamics, and how such data could help with the calibration and validation of these models, with an emphasis on urban systems.
Below you can read the abstract of the chapter and see some of the figures we used to support our discussion, along with the full reference and a pdf proof of the chapter. As always, any thoughts or comments are welcome.

Abstract:

Big Data (BD) offers researchers the scope to simulate population behavior through vastly more powerful Agent Based Models (ABMs), presenting exciting opportunities in the design and appraisal of policies and plans. Agent-based simulations capture system richness by representing micro-level agent choices and their dynamic interactions. They aid analysis of the processes which drive emergent population level phenomena, their change in the future, and their response to interventions. The potential of ABMs has led to a major increase in applications, yet models are limited in that the individual-level data required for robust, reliable calibration are often only available in aggregate form. New (‘big’) sources of data offer a wealth of information about the behavior (e.g. movements, actions, decisions) of individuals. By building ABMs with BD, it is possible to simulate society across many application areas, providing insight into the behavior, interactions, and wider social processes that drive urban systems. This chapter will discuss, in the context of urban simulation, how BD can unlock the potential of ABMs, and how ABMs can leverage real value from BD. In particular, we will focus on how BD can improve an agent’s abstract behavioral representation and suggest how combining these approaches can both reveal new insights into urban simulation and address some of the most pressing issues in agent-based modeling, particularly those of calibration and validation.

Keywords: Agent-based models, Big Data, Emergence, Cities.

The growth in agent-based modeling, from search results of Web of Science and Google Scholar.
Hotspots of activity of Twitter users: tweet locations and associated densities for a selection of prolific users.

Full Reference:

Crooks, A.T., Malleson, N., Wise, S. and Heppenstall, A. (2018), Big Data, Agents and the City, in Schintler, L.A. and Chen, Z. (eds.), Big Data for Regional Science, Routledge, New York, NY, pp. 204-213. (pdf)

Modeling the Emergence of Riots: A Geosimulation Approach

As you might have guessed, the paper is about riots, but that is not all. In the paper we present a highly detailed cognitive model, based around identity theory and implemented through the PECS (Physical conditions, Emotional state, Cognitive capabilities, and Social status) framework. The purpose of the model (and paper) is to explore how the unique socioeconomic variables underlying Kibera, a slum in Nairobi, coupled with local interactions of its residents and the spread of a rumor, may trigger a riot such as those seen in 2007.
In order to explore this question from the “bottom up”, we have developed a novel agent-based model that integrates social network analysis (SNA) and geographic information systems (GIS). In the paper we argue that this integration facilitates the modeling of dynamic social networks created through the agents’ daily interactions. The GIS is used to develop a realistic environment in which agents move and interact, including a road network and points of interest that impact their daily lives.
Below are the abstract and a summary of the paper’s highlights, to give you a sense of our research contribution. We also provide some images, either from the paper itself or from the Overview, Design Concepts, and Details (ODD) protocol. Finally, at the bottom of this post you can see one of the simulation runs and details of where the model can be downloaded, along with the full citation.

Paper Abstract:

Immediately after the 2007 Kenyan election results were announced, the country erupted in protest. Riots were particularly severe in Kibera, an informal settlement located within the nation’s capital, Nairobi. Through the lens of geosimulation, an agent-based model is integrated with social network analysis and geographic information systems to explore how the environment and local interactions underlying Kibera, combined with an external trigger, such as a rumor, led to the emergence of riots. We ground our model on empirical data of Kibera’s geospatial landscape, heterogeneous population, and daily activities of its residents. In order to effectively construct a model of riots, however, we must have an understanding of human behavior, especially that related to an individual’s need for identity and the role rumors play on a person’s decision to riot. This provided the foundation to develop the agents’ cognitive model, which created a feedback system between the agents’ activities in physical space and interactions in social space. Results showed that youth are more susceptible to rioting. Systematically increasing education and employment opportunities, however, did not have simple linear effects on rioting, or even on quality of life with respect to income and activities. The situation is more complex. By linking agent-based modeling, social network analysis, and geographic information systems we were able to develop a cognitive framework for the agents, better represent human behavior by modeling the interactions that occur over both physical and social space, and capture the nonlinear, reinforcing nature of the emergence and dissolution of riots.

Keywords: agent-based modeling; geographic information systems; social network analysis; riots; social influence; rumor propagation.

Paper Highlights:

  • An agent-based model integrates geographic information systems and social network analysis to model the emergence of riots. 
  • The physical environment and agent attributes are developed using empirical data, including GIS and socioeconomic data. 
  • The agent’s cognitive framework allowed for modeling their activities in physical space and interactions in social space. 
  • Through the integration of the three techniques, we were able to capture the complex, nonlinear nature of riots. 
  • Results show that youth are most vulnerable, and that increasing education and employment has nonlinear effects on rioting.

The high-level UML diagram of the model
A high-level representation of the model’s agent behavior incorporated into the PECS framework

An example of the evolution of social networks of ten Residents across the first two days of a simulation run.

The movie below shows the agent-based model, which explores ethnic clashes in the Kenyan slum. The environment is made up of households, businesses, and service facilities (these data come from OpenStreetMap). Agents within the model use a transportation network to move across the environment. As agents go about their daily activities, they interact with other agents, building out an evolving social network. Agents seek to meet their identity standard. Failure to reach it increases an agent’s frustration, which can lead to an aggressive response such as rioting (shown by the agent’s color changing from blue to red).
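
The frustration mechanism described above can be illustrated with a toy sketch. This is not the model from the paper (which is implemented through the PECS framework and documented in the ODD protocol); the class name, the numeric thresholds and the update rule below are all assumptions made purely for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Resident:
    """Toy agent; every attribute and threshold here is an illustrative assumption."""
    identity_standard: float      # level of satisfaction the agent aims for
    frustration: float = 0.0      # accumulates when the standard is not met
    rioting: bool = False         # False = "blue", True = "red" in the movie

    def step(self, satisfaction: float, riot_threshold: float = 3.0) -> None:
        """Update frustration from today's satisfaction of the identity standard."""
        shortfall = self.identity_standard - satisfaction
        if shortfall > 0:
            self.frustration += shortfall                        # failure raises frustration
        else:
            self.frustration = max(0.0, self.frustration - 0.5)  # success lets it decay
        self.rioting = self.frustration >= riot_threshold


def simulate(days: int, seed: int = 0) -> list[int]:
    """Return the number of rioting agents at the end of each simulated day."""
    rng = random.Random(seed)
    residents = [Resident(identity_standard=rng.uniform(0.3, 0.9)) for _ in range(100)]
    counts = []
    for _ in range(days):
        for resident in residents:
            # Hypothetical daily outcome in [0, 1); the real model derives this
            # from activities and social interactions.
            resident.step(satisfaction=rng.random())
        counts.append(sum(r.rioting for r in residents))
    return counts
```

Calling `simulate(30)` gives a daily count of agents in the "red" state; in the actual model that state also depends on the rumor and on the agent's social network, which this sketch leaves out.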

As with many of our models, we provide the data, model code and detailed model description in the form of the ODD protocol for others to use, learn more or to extend. Click here for more information.

Full Reference:

Pires, B. and Crooks, A.T. (2017), Modeling the Emergence of Riots: A Geosimulation Approach, Computers, Environment and Urban Systems, 61: 66-80. (pdf)

As normal, any thoughts or comments are most appreciated.
 

Summer Projects

Over the summer, Arie Croitoru and I took part in the George Mason University Aspiring Scientists Summer Internship Program. We worked with three very talented high-school students who, over the course of the seven-and-a-half-week program, produced some excellent research in the areas of agent-based modeling and social media analysis. An overview of their work can be seen in the posters and abstracts that the students produced at the end of the internship.
Lawrence Wang explored how social media could be used to predict election results in a project entitled “And the Winner Is? Predicting Election Results using Social Media”. Below you can read Lawrence’s abstract and see his poster.

“The 2012 U.S. presidential election demonstrated how Twitter can serve as a widely accessible forum of political discourse. Recently, researchers have investigated whether social media, particularly Twitter, can function as a predictive tool. In the past decade, multiple studies have claimed to successfully predict the results of elections using Twitter data. However, many of these studies fail to account for the inherent population bias present in Twitter data, leading to ungeneralizable results. In this project, I investigate the prospects of using Twitter data as an alternative to poll data for predicting the 2012 presidential election. The tweet corpus consisted of tweets published one month before the November election day. Using VADER, a sentiment analysis tool, I analyzed over 140,000 tweets for political sentiment. I attempted to circumvent the Twitter population bias by comparing age, race, and gender metrics of the Twitter population with that of the U.S. population. Furthermore, I utilized Bayesian inference with prior distributions from the results of the 2008 presidential election in order to mitigate the effects of limited tweet data in certain states. The resulting model correctly predicted the likely outcomes of 46 of the 50 states and predicted that President Obama would be reelected with a probability of 0.945. Such a model could be used to explore the forthcoming elections.”
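
To make the Bayesian step concrete, here is a minimal sketch of how a prior built from a previous election could be combined with a small sample of classified tweets for one state. It is only an illustration under assumptions: the abstract does not specify the form of the model, so the Beta-Binomial update, the `prior_strength` parameter and all of the numbers are hypothetical.

```python
import random


def posterior_params(prior_share: float, prior_strength: float,
                     tweets_for: int, tweets_against: int) -> tuple[float, float]:
    """Beta-Binomial update with a prior centred on the previous vote share.

    prior_strength is a hypothetical pseudo-count: the larger it is, the more
    tweets are needed to move the estimate away from the prior.
    """
    alpha = prior_share * prior_strength + tweets_for
    beta = (1.0 - prior_share) * prior_strength + tweets_against
    return alpha, beta


def win_probability(alpha: float, beta: float,
                    draws: int = 20_000, seed: int = 0) -> float:
    """Monte Carlo estimate of P(vote share > 0.5) under a Beta(alpha, beta) posterior."""
    rng = random.Random(seed)
    wins = sum(rng.betavariate(alpha, beta) > 0.5 for _ in range(draws))
    return wins / draws


# Hypothetical state: 54% for the candidate last time, only 40 usable tweets now.
alpha, beta = posterior_params(prior_share=0.54, prior_strength=50,
                               tweets_for=25, tweets_against=15)
posterior_mean = alpha / (alpha + beta)
print(round(posterior_mean, 3), win_probability(alpha, beta))
```

With few tweets the estimate stays close to the prior; as the number of tweets in a state grows, the data dominate. That is the sense in which a prior can mitigate limited tweet data.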

In a second project, Varun Talwar explored how knowledge bases could be utilized to better contextualize social media discussions, in a project entitled “Context Graphs: A Knowledge-Driven Model for Contextualizing Twitter Discourse”. Below you can read Varun’s project abstract and see his end-of-project poster.

“Introduction: User-posted content on online social media (SM) platforms has in recent years emerged as a rich field for narrative analysis of topics captured during the discussion discourse. In particular, collective discourse has been used to manually contextualize public perception of health-related events.

Objective: As SM feeds tend to be noisy, automated detection of the context of a given SM discourse stream has proven to be a challenging task. The primary objective of this research is to explore how existing knowledge bases could be utilized to better contextualize SM discussions through topic modeling and mining. By utilizing such existing knowledge it would then be possible to explore to what extent a given discourse is related to a known or a new context, as well as compare and contrast SM discussions through their respective contexts.

Methods: In order to accomplish these goals this research proposes a novel approach for contextualizing SM discourse. In this approach, topic modeling is combined with a knowledge base in a two-step process. First, key topics are extracted from an SM data corpus by applying a statistical topic-modeling algorithm, a process that also results in data dimensionality reduction. Once a set of salient topics is extracted, each topic is then used to mine the knowledge base for sub-graphs that represent the contextual linkages between knowledge elements. Such sub-graphs can then further disambiguate the topic modeling results, and be utilized for qualifying context similarity across SM discussions.

Results: The time-series analysis of the Twitter discourse via graph-matching algorithms reveals the change in topics as evidenced by the emergence of the terms “pregnancy” and “abortion” as information about the virus propagated through the Twitter community.”
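
The second step of the method, mining a knowledge base for the sub-graph that links a topic's terms, can be sketched roughly as follows. The tiny knowledge base, the term lists, and the use of Jaccard overlap as the similarity measure are all assumptions for illustration; the abstract does not name the knowledge base, the topic-modeling algorithm, or the graph-matching algorithm that were actually used.

```python
from itertools import combinations

# Hypothetical knowledge base: undirected links between concepts.
KNOWLEDGE_BASE = {
    ("virus", "mosquito"), ("virus", "pregnancy"), ("virus", "fever"),
    ("pregnancy", "birth defect"), ("mosquito", "standing water"),
    ("fever", "rash"),
}


def neighbours(concept: str) -> set[str]:
    """Concepts directly linked to `concept` in the knowledge base."""
    return {b if a == concept else a
            for a, b in KNOWLEDGE_BASE if concept in (a, b)}


def context_graph(topic_terms: list[str]) -> set[frozenset[str]]:
    """Edges connecting topic terms directly or through one shared neighbour."""
    edges = set()
    for t1, t2 in combinations(topic_terms, 2):
        if t2 in neighbours(t1):
            edges.add(frozenset((t1, t2)))
        for shared in neighbours(t1) & neighbours(t2):
            edges.add(frozenset((t1, shared)))
            edges.add(frozenset((t2, shared)))
    return edges


def context_similarity(g1: set, g2: set) -> float:
    """Jaccard overlap of two context graphs (0 = disjoint, 1 = identical)."""
    if not g1 and not g2:
        return 1.0
    return len(g1 & g2) / len(g1 | g2)


# Hypothetical topic terms from two time windows of a discussion.
early = context_graph(["virus", "mosquito", "fever"])
later = context_graph(["virus", "pregnancy", "birth defect"])
print(context_similarity(early, later))
```

A low similarity between consecutive time windows would flag a shift in the context of the discussion, which is the kind of change the results paragraph describes.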

Elizabeth Hu explored the current migration crisis in Europe in a project entitled “Across the Sea: A Novel Agent-Based Model for the Migratory Patterns of the European Refugee Crisis”. Below is Elizabeth’s abstract, poster and an example model run.

“Since 2010, a growing number of refugees have sought asylum in European nations, fleeing violence and military conflict in their home countries. Most of the refugees originate from Syria, Iraq, Afghanistan, and African nations. The vast majority of refugees risk their lives on the popular yet perilous Mediterranean Sea route, which is prone to boat accidents and subsequent deaths of migrants. The flow of millions of refugees has introduced a humanitarian crisis not seen since World War II. European nations are struggling to cope with the influx of refugees through various border policies.

In order to explore this crisis, a geographically explicit agent-based model has been developed to study the past and future patterns of refugee flows. Traditional migration models, which represent the population as an aggregate, fail to consider individual decision-making processes based on personal status and intervening opportunities. However, the novel agent-based model of migration developed here allows population behavior to emerge as the result of individual decisions. Initial population, city, and route attributes are based upon data from the UNHCR, EU agencies, crowd-sourced databases, and news articles. The agents, refugees, select goal destinations in accordance with the Law of Intervening Opportunities. Thus, goals are prone to change with fluctuating personal needs. Agents choose routes not only based on distance, but also other relevant route attributes. The resulting migration flows generated by the model under various circumstances could provide crucial guidance for policy and humanitarian aid decisions.”
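
As a rough illustration of destination choice under the Law of Intervening Opportunities, the sketch below uses the common exponential (Schneider-style) formulation, in which the chance of stopping at a destination depends on how many opportunities lie closer to the origin. The abstract does not say which formulation the model uses, so the function, the `accept_rate` parameter and the example destinations are assumptions.

```python
import math


def intervening_opportunity_probs(destinations: list[tuple[str, float, float]],
                                  accept_rate: float) -> dict[str, float]:
    """Probability that a traveller stops at each destination.

    destinations: (name, distance from origin, number of opportunities).
    accept_rate:  hypothetical chance of accepting any single opportunity.
    """
    ranked = sorted(destinations, key=lambda d: d[1])  # nearest first
    probs = {}
    passed = 0.0  # opportunities closer to the origin than the current destination
    for name, _distance, opportunities in ranked:
        p_reach = math.exp(-accept_rate * passed)                    # passed everything nearer
        p_leave = math.exp(-accept_rate * (passed + opportunities))  # and passes this one too
        probs[name] = p_reach - p_leave
        passed += opportunities
    return probs


# Hypothetical destinations with equal opportunities at different distances.
example = [("City A", 100.0, 50.0), ("City B", 300.0, 50.0), ("City C", 200.0, 50.0)]
for city, p in intervening_opportunity_probs(example, accept_rate=0.01).items():
    print(city, round(p, 3))
```

Distance matters here only through the ranking: a far destination is less likely to be chosen because more opportunities intervene, not because of the distance itself. In an agent-based setting the opportunity counts could differ per agent and change over time, which is one way goals could shift with personal needs.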

The movie below gives a sense of the migration paths the refugees are taking.

A Semester with Urban Analytics

This past semester I gave a new class at GMU entitled “Urban Analytics”. In a nutshell, the class was about introducing students to a broad interdisciplinary field that focuses on the use of data to study cities. More specifically the emphasis of the cla…

“Space, the Final Frontier”: How Good are Agent-Based Models at Simulating Individuals and Space in Cities?

Recently, Alison Heppenstall, Nick Malleson and I had a paper accepted in Systems entitled “‘Space, the Final Frontier’: How Good are Agent-Based Models at Simulating Individuals and Space in Cities?” In the paper we critically examine how well agent-based models have simulated a variety of urban processes. We discuss what considerations are needed when choosing the appropriate level of spatial analysis and time frame to model urban phenomena, and what role Big Data can play in agent-based modeling. Below you can read the abstract of the paper and see a number of example applications discussed.

Abstract: Cities are complex systems, comprising many interacting parts. How we simulate and understand causality in urban systems is continually evolving. Over the last decade the agent-based modeling (ABM) paradigm has provided a new lens for understanding the effects of interactions of individuals and how through such interactions macro structures emerge, both in the social and physical environment of cities. However, such a paradigm has been hindered by limited computational power and a lack of large fine-scale datasets. Within the last few years we have witnessed a massive increase in computational processing power and storage, combined with the onset of Big Data. Today geographers find themselves in a data-rich era. We now have access to a variety of data sources (e.g., social media, mobile phone data, etc.) that tell us how, and when, individuals are using urban spaces. These data raise several questions: can we effectively use them to understand and model cities as complex entities? How well have ABM approaches lent themselves to simulating the dynamics of urban processes? What has been, or will be, the influence of Big Data on increasing our ability to understand and simulate cities? What is the appropriate level of spatial analysis and time frame to model urban phenomena? Within this paper we discuss these questions using several examples of ABM applied to urban geography to begin a dialogue about the utility of ABM for urban modeling. The arguments that the paper raises are applicable across the wider research environment where researchers are considering using this approach.

Keywords: cities; agent-based modeling; big data; crime; retail; space; simulation

Figure 1. (A) System structure; (B) System hierarchy; and (C) Related subsystems/processes (adapted from Batty, 2013).

Reference cited:

Batty, M. (2013).  The New Science of Cities; MIT Press: Cambridge, MA, USA.

Full reference to the open access paper:

Heppenstall, A., Malleson, N. and Crooks, A.T. (2016). “Space, the Final Frontier”: How Good are Agent-based Models at Simulating Individuals and Space in Cities?, Systems, 4(1), 9; doi: 10.3390/systems4010009 (pdf)

 
