Big Data and Design: More Baboon, Less Unicorn

I recently had the pleasure of giving a Creative Mornings talk. Each month there is a new theme that the presenters need to refer to – mine was “fantasy”, so I chose to open with one of my favourite fantasy creatures: the unicorn. It’s a talk about the creative process behind Oliver Uberti’s and my […]

Continue reading »

The Full Stack: Tools & Processes for Urban Data Scientists

Recently, I was asked to give talks at both UCL’s CASA and the ETH Future Cities Lab in Singapore for students and staff new to ‘urban data science’ and the sorts of workflows involved in collecting, processing, analysing, and reporting on …

Continue reading »

New Paper: User-Generated Big Data and Urban Morphology

Continuing our work with crowdsourcing and geosocial analysis, we recently had a paper published in a special issue of the Built Environment journal entitled “User-Generated Big Data and Urban Morphology.”
The theme of the special issue, which was guest edited by Mike Batty and includes 12 papers, is “Big Data and the City”. To quote from the website:

“This cutting edge special issue responds to the latest digital revolution, setting out the state of the art of the new technologies around so-called Big Data, critically examining the hyperbole surrounding smartness and other claims, and relating it to age-old urban challenges. Big data is everywhere, largely generated by automated systems operating in real time that potentially tell us how cities are performing and changing. A product of the smart city, it is providing us with novel data sets that suggest ways in which we might plan better, and design more sustainable environments. The articles in this issue tell us how scientists and planners are using big data to better understand everything from new forms of mobility in transport systems to new uses of social media. Together, they reveal how visualization is fast becoming an integral part of developing a thorough understanding of our cities.”

Table of Contents

In our paper we discuss and show how crowdsourced data is leading to the emergence of alternate views of urban morphology that better capture the intricate nature of urban environments and their dynamics. Specifically, we show how such data can provide information about linked spaces and geosocial neighborhoods. We argue that a geosocial neighborhood is not defined by its administrative boundaries, planning zones, or physical barriers, but rather by its emergence as an organic, self-organized social construct that is embedded in geographical spaces linked by human activity. Below is the abstract of the paper and some of its figures, which showcase our work.
“Traditionally urban morphology has been the study of cities as human habitats through the analysis of their tangible, physical artefacts. Such artefacts are outcomes of complex social and economic forces, and their study is primarily driven by traditional modes of data collection (e.g. based on censuses, physical surveys, and mapping). The emergence of Web 2.0 and through its applications, platforms and mechanisms that foster user-generated contributions to be made, disseminated, and debated in cyberspace, is providing a new lens in the study of urban morphology. In this paper, we showcase ways in which user-generated ‘big data’ can be harvested and analyzed to generate snapshots and impressionistic views of the urban landscape in physical terms. We discuss and support through representative examples the potential of such analysis in revealing how urban spaces are perceived by the general public, establishing links between tangible artefacts and cyber-social elements. These links may be in the form of references to, observations about, or events that enrich and move beyond the traditional physical characteristics of various locations. This leads to the emergence of alternate views of urban morphology that better capture the intricate nature of urban environments and their dynamics.”

Keywords: Urban Morphology, Social Media, GeoSocial, Cities, Big Data.
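To make the idea of places "linked by human activity" a little more concrete, here is a minimal sketch and not the method used in the paper. It assumes a hypothetical list of (user, place) records, links two places when enough distinct users have posted from both, and treats each connected set of places as one candidate geosocial neighborhood. The function names, the place labels and the `min_shared_users` threshold are all illustrative assumptions.

```python
from collections import defaultdict
from itertools import combinations


def link_counts(posts):
    """Count, for every pair of places, how many distinct users posted from both.

    posts is a list of (user, place) tuples (a hypothetical input format).
    """
    places_by_user = defaultdict(set)
    for user, place in posts:
        places_by_user[user].add(place)
    links = defaultdict(int)
    for places in places_by_user.values():
        for p, q in combinations(sorted(places), 2):
            links[(p, q)] += 1
    return dict(links)


def geosocial_groups(posts, min_shared_users=2):
    """Group places into connected components of sufficiently strong links."""
    neighbours = defaultdict(set)
    for (p, q), shared in link_counts(posts).items():
        if shared >= min_shared_users:
            neighbours[p].add(q)
            neighbours[q].add(p)
    seen, groups = set(), []
    for start in sorted({place for _, place in posts}):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:  # depth-first walk over linked places
            place = stack.pop()
            if place in group:
                continue
            group.add(place)
            stack.extend(neighbours[place] - group)
        seen |= group
        groups.append(group)
    return groups


# Made-up records: two users visit both the park and the arena.
posts = [
    ("u1", "park"), ("u1", "arena"),
    ("u2", "park"), ("u2", "arena"),
    ("u3", "arena"), ("u3", "museum"),
    ("u4", "museum"),
]
groups = geosocial_groups(posts)
```

With these made-up records the park and the arena end up in one group because two users visited both, while the museum stays on its own because only one user links it to the arena. Note that the grouping ignores distance and administrative boundaries entirely, which is the point of the geosocial view.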

City Infoscapes – Fusing Data from Physical (L1, L2), Social, Perceptual (L3) Spaces to Derive Place Abstractions (L4) for Different Locations (N1, N2).
Recreational Hotspots Composed of “Locals” and “Tourists” with Perceived Artifacts Indicating “Use” and “Need”. (A) High Line Park (B) Madison Square Garden.




Moving from Spatial Neighborhoods to Geosocial Neighborhoods via Links.

The Emergence of Geosocial Neighborhoods in the Aftermath of the 2013 Boston Marathon Bombing.

Full  Reference: 

Crooks, A.T., Croitoru, A., Jenkins, A., Mahabir, R., Agouris, P. and Stefanidis, A. (2016). “User-Generated Big Data and Urban Morphology,” Built Environment, 42 (3): 396-414. (pdf)

Continue reading »

Summer Projects

Over the summer, Arie Croitoru and I took part in the George Mason University Aspiring Scientists Summer Internship Program. We worked with three very talented high-school students who, over the course of the seven-and-a-half-week program, produced some excellent research in the areas of agent-based modeling and social media analysis. An overview of their work can be seen in the posters and abstracts that the students produced at the end of the internship.
Lawrence Wang explored how social media could be used with respect to predicting election results under a project entitled “And the Winner Is? Predicting Election Results using Social Media”. Below you can read Lawrence’s abstract and see his poster.
“The 2012 U.S. presidential election demonstrated how Twitter can serve as a widely accessible forum of political discourse. Recently, researchers have investigated whether social media, particularly Twitter, can function as a predictive tool. In the past decade, multiple studies have claimed to successfully predict the results of elections using Twitter data. However, many of these studies fail to account for the inherent population bias present in Twitter data, leading to ungeneralizable results. In this project, I investigate the prospects of using Twitter data as an alternative to poll data for predicting the 2012 presidential election. The tweet corpus consisted of tweets published one month before the November election day. Using VADER, a sentiment analysis tool, I analyzed over 140,000 tweets for political sentiment. I attempted to circumvent the Twitter population bias by comparing age, race, and gender metrics of the Twitter population with that of the U.S. population. Furthermore, I utilized Bayesian inference with prior distributions from the results of the 2008 presidential election in order to mitigate the effects of limited tweet data in certain states. The resulting model correctly predicted the likely outcomes of 46 of the 50 states and predicted that President Obama would be reelected with a probability of 0.945. Such a model could be used to explore the forthcoming elections.”
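The Bayesian step mentioned in the abstract can be illustrated with a small Beta-Binomial sketch. This is not Lawrence's model: the abstract does not give its form, so the prior strength, the tweet counts and the function names below are assumptions made purely for illustration. The sketch only shows how a prior built from an earlier election result can steady the estimate for a state with very few tweets.

```python
import random


def posterior_params(prior_share, prior_strength, favourable, total):
    """Beta-Binomial update for one state.

    prior_share    -- a candidate's vote share in the earlier election (0..1)
    prior_strength -- how many pseudo-observations the prior is worth (assumed)
    favourable     -- tweets classified as favouring the candidate
    total          -- all classified tweets from the state
    Returns the (alpha, beta) parameters of the posterior Beta distribution.
    """
    alpha = prior_share * prior_strength + favourable
    beta = (1.0 - prior_share) * prior_strength + (total - favourable)
    return alpha, beta


def posterior_mean(alpha, beta):
    """Expected support under a Beta(alpha, beta) posterior."""
    return alpha / (alpha + beta)


def win_probability(alpha, beta, draws=20000, seed=0):
    """Monte Carlo estimate of P(support > 0.5) under the posterior."""
    rng = random.Random(seed)
    wins = sum(rng.betavariate(alpha, beta) > 0.5 for _ in range(draws))
    return wins / draws


# Made-up numbers: a 55% prior share worth 50 observations, then 3 of 10 tweets.
sparse = posterior_params(0.55, 50, favourable=3, total=10)
# The same prior, but 3,000 of 10,000 tweets.
dense = posterior_params(0.55, 50, favourable=3000, total=10000)
```

With only ten tweets the posterior mean stays close to the prior (about 0.51 rather than the raw 0.30), whereas with ten thousand tweets the data dominate and the mean moves to roughly 0.30. Correcting for the demographic bias of Twitter users, which the abstract also describes, is a separate step that this sketch does not attempt.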

In a second project, entitled “Context Graphs: A Knowledge-Driven Model for Contextualizing Twitter Discourse,” Varun Talwar explored how knowledge bases could be utilized to better contextualize social media discussions. Below you can read Varun’s project abstract and see his end-of-project poster.

Introduction: User posted content through online social media (SM) platforms in recent years has emerged as a rich field for narrative analysis of topics captured during the discussion discourse. In particular, collective discourse has been used to manually contextualize public perception of health related events.

Objective: As SM feeds tend to be noisy, automated detection of the context of a given SM discourse stream has proven to be a challenging task. The primary objective of this research is to explore how existing knowledge bases could be utilized to better contextualize SM discussions through topic modeling and mining. By utilizing such existing knowledge it would then be possible to explore to what extent a given discourse is related to a known or a new context, as well as compare and contrast SM discussions through their respective contexts.

Methods: In order to accomplish these goals this research proposes a novel approach for contextualizing SM discourse. In this approach, topic modeling is combined with a knowledgebase in a two-step process. First, key topics are extracted from a SM data corpus by applying a statistical topic-modeling algorithm, a process that also results in data dimensionality reduction. Once a set of salient topics are extracted, each topic is then used to mine the knowledge base for sub graphs that represent the contextual linkages between knowledge elements. Such sub-graphs can then further disambiguate the topic modeling results, and be utilized for qualifying context similarity across SM discussions.

Results: The time-series analysis of the Twitter discourse via graph-matching algorithms reveals the change in topics as evidenced by the emergence of the terms “pregnancy” and “abortion” as information about the virus propagated through the Twitter community.
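The two-step process in the Methods paragraph can be sketched in a few lines. This is only a toy illustration under stated assumptions: plain word counts stand in for the statistical topic model, a small hand-made edge set stands in for the knowledge base, and Jaccard overlap of edges stands in for the graph-matching step. None of these choices, nor the function names, come from the project itself.

```python
from collections import Counter

# Toy knowledge base: hypothetical undirected links between concepts.
KNOWLEDGE_BASE = {
    ("virus", "mosquito"),
    ("virus", "pregnancy"),
    ("pregnancy", "birth defect"),
    ("mosquito", "insecticide"),
    ("election", "candidate"),
}

STOPWORDS = {"the", "a", "is", "of", "and", "to", "in", "by", "about", "for"}


def salient_terms(posts, k=3):
    """Step 1 (stand-in for topic modeling): the k most frequent non-stopwords."""
    counts = Counter(
        word
        for post in posts
        for word in post.lower().split()
        if word not in STOPWORDS
    )
    return {term for term, _ in counts.most_common(k)}


def context_subgraph(terms, knowledge_base=KNOWLEDGE_BASE):
    """Step 2: keep the knowledge-base edges that touch any salient term."""
    return {edge for edge in knowledge_base if edge[0] in terms or edge[1] in terms}


def context_similarity(graph_a, graph_b):
    """Jaccard overlap of two edge sets (a simple proxy for graph matching)."""
    union = graph_a | graph_b
    return len(graph_a & graph_b) / len(union) if union else 0.0


# Two made-up time slices of a discussion.
early = ["virus spread by mosquito", "mosquito virus warning"]
late = ["virus and pregnancy risk", "pregnancy virus advice"]

early_graph = context_subgraph(salient_terms(early, k=2))
late_graph = context_subgraph(salient_terms(late, k=2))
similarity = context_similarity(early_graph, late_graph)
```

Comparing the sub-graphs of consecutive time slices gives a rough measure of how far the context of a discussion has drifted. In this toy example the two slices share half of their edges, reflecting a shift from transmission towards pregnancy.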

Elizabeth Hu explored the current migration crisis in Europe in a project entitled “Across the Sea: A Novel Agent-Based Model for the Migratory Patterns of the European Refugee Crisis”. Below is Elizabeth’s abstract, poster and an example model run.

“Since 2010, a growing number of refugees have sought asylum in European nations, fleeing violence and military conflict in their home countries. Most of the refugees originate from Syria, Iraq, Afghanistan, and African nations. The vast majority of refugees risk their lives in the popular yet perilous Mediterranean Sea Route often prone to boat accidents and subsequent deaths of migrants.  The flow of millions of refugees has introduced a humanitarian crisis not seen since World War II. European nations are struggling to cope with the influx of refugees through various border policies.

In order to explore this crisis, a geographically explicit agent-based model has been developed to study the past and future patterns of refugee flows. Traditional migration models, which represent the population as an aggregate, fail to consider individual decision-making processes based on personal status and intervening opportunities. However, the novel agent-based model developed here of migration allows population behavior to emerge as the result of individual decisions. Initial population, city, and route attributes are based upon data from the UNHCR, EU agencies, crowd-sourced databases, and news articles. The agents, refugees, select goal destinations in accordance with the Law of Intervening Opportunities. Thus, goals are prone to change with fluctuating personal needs. Agents choose routes not only based on distance, but also other relevant route attributes. The resulting migration flows generated by the model under various circumstances could provide crucial guidance for policy and humanitarian aid decisions.”
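The abstract does not say how the Law of Intervening Opportunities is implemented in the model. As a rough illustration only, the sketch below uses the classic exponential formulation often attributed to Schneider: destinations are ranked by distance, and the probability of settling at a destination is exp(-L·V_before) - exp(-L·V_after), where V is the cumulative number of opportunities up to that point. The destination labels, the opportunity counts and the value of L are made up.

```python
import math


def intervening_opportunity_probs(destinations, L):
    """Probability that an agent settles at each destination.

    destinations -- list of (name, distance, opportunities) tuples (hypothetical)
    L            -- chance that any single opportunity is accepted (assumed)
    Any probability mass left over corresponds to not settling at all.
    """
    ranked = sorted(destinations, key=lambda d: d[1])  # nearest first
    probs = {}
    cumulative = 0.0
    for name, _distance, opportunities in ranked:
        still_travelling_before = math.exp(-L * cumulative)
        cumulative += opportunities
        still_travelling_after = math.exp(-L * cumulative)
        probs[name] = still_travelling_before - still_travelling_after
    return probs


# Made-up destinations: (label, distance in km, number of opportunities).
cities = [("B", 900, 200), ("A", 300, 100), ("C", 1500, 400)]
probs = intervening_opportunity_probs(cities, L=0.005)
```

In this example the nearest destination attracts the most agents even though it offers the fewest opportunities, because every closer opportunity "intervenes" before a farther one can be considered. In an agent-based setting the probabilities would be recomputed as an agent's position and needs change, which matches the abstract's remark that goals are prone to change.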

The movie below gives a sense of the migration paths the refugees are taking.

Continue reading »

Call For Papers: Smart Buildings and Cities

Special Issue on Smart Buildings and Cities for IEEE Pervasive Computing

Submission deadline: 1 July 2016 (extended to 18 July 2016)
Publication date: April–June 2017

One of Mark Weiser’s first envisionments of ubiquitous and pervasive computing had the smart home as its central core. Since then, researchers focused on realizing this vision have built out from the smart home to the smart city. Such environments aim to improve the transparency of information and the quality of life through access to smarter and more appropriate services.

Despite efforts to build these environments, there are still many unanswered questions: What does it mean to make a building or a city “smart”? What infrastructure is necessary to support smart environments? What is the return on investment of a smart environment?

The key to building smart environments is the fusion of multiple technologies including sensing, advanced networks, the Internet of Things, cloud computing, big data analytics, and mobile devices. This special issue aims to explore new technologies, methodologies, case studies, and applications related to smart buildings and cities. Contributions may come from diverse fields such as distributed systems, HCI, ambient intelligence, architecture, transportation and urban planning, policy development, and cyber-physical systems. Relevant topics for this issue include:

  • Applications, evaluations, or case studies of smart buildings/cities
  • Architectures and systems software to support smart environments
  • Big data analytics for monitoring and managing smart environments
  • Economic models for smart buildings/cities
  • Models for user interaction in smart environments
  • Formative studies regarding the design, use, and acceptance of smart services
  • Configuration and management of smart environments
  • Embedded, mobile, and crowd sensing approaches
  • Cloud computing for smart environments
  • Domain-specific investigations (such as transportation or healthcare)

The guest editors invite original and high-quality submissions addressing all aspects of this field, as long as the connection to the focus topic is clear and emphasized.

Continue reading »

Algorithmic governance in environmental information (or how technophilia shape environmental democracy)

These are the slides from my talk at the Algorithmic Governance workshop (for which there are lengthy notes in the previous post). The workshop explored the many ethical, legal and conceptual issues with the transition to Big Data and algorithm-based decision-making. My contribution to the discussion is based on previous thoughts on environmental information …

Continue reading »

Algorithmic Governance Workshop (NUI Galway)

The workshop ‘Algorithmic Governance’ was organised as an intensive one-day discussion and research needs development. As the organisers Dr John Danaher and Dr Rónán Kennedy identified: ‘The past decade has seen an explosion in big data analytics and the use of algorithm-based systems to assist, supplement, or replace human decision-making. This is true in private industry and …

Continue reading »

Call For Papers: Rethinking the ABCs

Readers of the blog might be interested in a workshop being organized by Daniel Brown, Eun-Kyeong Kim, Liliana Perez, and Raja Sengupta entitled:

Rethinking the ABCs: Agent-Based Models and Complexity Science in the age of Big Data, CyberGIS, and Sensor networks

September 27th, 2016 in Montreal, Canada

To quote from the call:

“A broad scope of concepts and methodologies from complexity science – including Agent-Based Models, Cellular Automata, network theory, chaos theory, and scaling relations – has contributed to a better understanding of spatial/temporal dynamics of complex geographic patterns and process.

Recent advances in computational technologies such as Big Data, Cloud Computing and CyberGIS platforms, and Sensor Networks (i.e. the Internet of Things) provides both new opportunities and raises new challenges for ABM and complexity theory research within GIScience. Challenges include parameterization of complex models with volumes of georeferenced data being generated, scale model applications to realistic simulations over broader geographic extents, explore the challenges in their deployment across large networks to take advantage of increased computational power, and validate their output using real-time data, as well as measure the impact of the simulation on knowledge, information and decision-making both locally and globally via the world wide web.

The scope of this workshop is to explore novel complexity science approaches to dynamic geographic phenomena and their applications, addressing challenges and enriching research methodologies in geography in a Big Data Era.”

More information about the workshop can be found at https://sites.psu.edu/bigcomplexitygisci/

Continue reading »

“Space, the Final Frontier”: How Good are Agent-Based Models at Simulating Individuals and Space in Cities?

Alison Heppenstall, Nick Malleson and I have just had a paper accepted in Systems entitled “‘Space, the Final Frontier’: How Good are Agent-Based Models at Simulating Individuals and Space in Cities?” In the paper we critically examine how well agent-based models have simulated a variety of urban processes. We discuss what considerations are needed when choosing the appropriate level of spatial analysis and time frame to model urban phenomena, and what role Big Data can play in agent-based modeling. Below you can read the abstract of the paper and see a number of example applications discussed.

Abstract: Cities are complex systems, comprising of many interacting parts. How we simulate and understand causality in urban systems is continually evolving. Over the last decade the agent-based modeling (ABM) paradigm has provided a new lens for understanding the effects of interactions of individuals and how through such interactions macro structures emerge, both in the social and physical environment of cities. However, such a paradigm has been hindered due to computational power and a lack of large fine scale datasets. Within the last few years we have witnessed a massive increase in computational processing power and storage, combined with the onset of Big Data. Today geographers find themselves in a data rich era. We now have access to a variety of data sources (e.g., social media, mobile phone data, etc.) that tells us how, and when, individuals are using urban spaces. These data raise several questions: can we effectively use them to understand and model cities as complex entities? How well have ABM approaches lent themselves to simulating the dynamics of urban processes? What has been, or will be, the influence of Big Data on increasing our ability to understand and simulate cities? What is the appropriate level of spatial analysis and time frame to model urban phenomena? Within this paper we discuss these questions using several examples of ABM applied to urban geography to begin a dialogue about the utility of ABM for urban modeling. The arguments that the paper raises are applicable across the wider research environment where researchers are considering using this approach.

Keywords: cities; agent-based modeling; big data; crime; retail; space; simulation

Figure 1. (A) System structure; (B) System hierarchy; and (C) Related subsystems/processes (adapted from Batty, 2013).

Reference cited:

Batty, M. (2013).  The New Science of Cities; MIT Press: Cambridge, MA, USA.

Full reference to the open access paper:

Heppenstall, A., Malleson, N. and Crooks, A.T. (2016). “Space, the Final Frontier”: How Good are Agent-based Models at Simulating Individuals and Space in Cities?, Systems, 4(1), 9; doi: 10.3390/systems4010009 (pdf)

 

Continue reading »

Mapping London’s Twitter Activity in 3d

Image 1. The tweet density from 8am to 4pm on 20th June 2015, Central London




Twitter mapping is an increasingly useful method for linking virtual activity to geographical space. Geo-tagged tweets carry the location from which the user tweeted, so those locations can be visualised on a map. Although geo-tagged tweets are a relatively small portion of all tweets, they let us explore density, spatial patterns and other otherwise invisible relationships between online and offline activity.

Recently, studies using geo-tagged tweets have analysed public responses to specific urban events, natural disasters and regional characteristics (Li et al., 2013) [1]. This work is also extending to traditional urban research topics, for example revealing spatial segregation and inequality in cities (Shelton et al., 2015) [2].

 

Twitter mapping in 3d can augment 2d visualisation by adding built environment context and richer information. There are many examples of Twitter mapping in 3d, such as A) #interactive/Andes [3], B) London’s Twitter Island [4] and C) Mapping London in real time, using Tweets [5]. A) and B) build up 3d mountains of geo-tagged tweets on the map. In the case of C), when geo-tagged tweets are sent in the city, the heights of the nearest buildings increase in the 3d model. These examples are creative and show different ways of viewing the integrated environments.

From the Networking City point of view, making a Twitter visualisation more tangible in a 3d urban model would help us better understand how urban environments are interconnected with invisible media flows.

 

To make the visualisation, the Twitter data were collected using the Big Data Toolkit developed by Steven Gray at CASA, UCL. In total, 53,750 geo-tagged tweets were collected across the UK on 20th June 2015. As Table 1 shows, the number of tweets was lowest at 5am and peaked at 10pm with 3,495 tweets. Video 1 shows the locations of the tweets in the UK and London over the course of that day.

 


Table 1. The Number of Geo-Coded Tweets in the UK on 20th June, 2015
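Hourly counts like those in Table 1 can be produced with a few lines of code. The following is only a minimal sketch, not the Big Data Toolkit's own method: the function name `tweets_per_hour`, the ISO timestamp format and the sample values are assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime


def tweets_per_hour(timestamps):
    """Return a list of 24 counts, one per hour of the day (0-23).

    `timestamps` is assumed to be an iterable of ISO 8601 strings;
    a real export may use a different format.
    """
    counts = Counter(datetime.fromisoformat(ts).hour for ts in timestamps)
    return [counts.get(hour, 0) for hour in range(24)]


# Hypothetical sample data, not taken from the collected tweets.
sample = [
    "2015-06-20T05:12:00",
    "2015-06-20T22:01:30",
    "2015-06-20T22:45:10",
]
hourly = tweets_per_hour(sample)
peak_hour = max(range(24), key=hourly.__getitem__)  # hour with most tweets
```

With the real data, `hourly` would hold the 24 values of Table 1 and `peak_hour` would identify the busiest hour.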

 

https://www.youtube.com/watch?v=dg-2VlVfFaM



Video 1. The location of Geo-Coded Tweets in the UK on 20th June, 2015



When we calculate the density of the data, London, and particularly Central London, contains the largest number of tweets (Image 2).

 

 

 

Image 2. The density of Geo-Coded Tweets in the UK on 20th June, 2015

To focus on the high-density data, a 6 km x 3.5 km area of Central London was chosen for the 3d model. Buildings, bridges, roads and natural features of this part of London were set in the model based on OS Building Heights data [6]. Some Google 3D Warehouse models were added to represent important landmarks such as St Paul’s, the London Eye and Tower Bridge, as you can see in Image 3, Image 4 and Image 5.

 

 
Image 3. The plan view of Central London model

Image 4. The perspective view of Central London model

Image 5. The perspective view of Central London model (view from BT Tower)

The geo-tagged data set was divided into one-hour periods and plotted on the map to identify the tweet density in the area. Through this process, we can see how the density changes with the time period. For example, the tweets are mainly concentrated around Piccadilly Circus and Trafalgar Square between 10am and 11am, but there are two high-density areas between 12pm and 1pm (see Image 6, Image 7, Image 8 and Image 9).

Image 6. The tweet density between 10am and 11am on 20th June 2015

Image 7. The tweet density between 12pm and 1pm on 20th June 2015

Image 8. The tweet density from 12am to 12pm

Image 9. The tweet density from 12pm to Midnight
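The hourly density maps above rest on a simple binning step: each tweet is assigned to an hour and to a grid cell, and the cells are counted. The sketch below illustrates that idea under stated assumptions. Coordinates are taken to be kilometres east and north of the south-west corner of the 6 km x 3.5 km study area, the 0.5 km cell size is arbitrary, and `density_grid` is a hypothetical helper rather than the tool actually used for the images.

```python
from collections import defaultdict


def density_grid(tweets, bbox, n_cols, n_rows):
    """Count tweets per grid cell for each hour.

    tweets: iterable of (hour, x, y) tuples.
    bbox:   (min_x, min_y, max_x, max_y) of the study area.
    Returns a dict mapping hour -> n_rows x n_cols list of counts.
    """
    min_x, min_y, max_x, max_y = bbox
    grids = defaultdict(lambda: [[0] * n_cols for _ in range(n_rows)])
    for hour, x, y in tweets:
        # Skip tweets that fall outside the study area.
        if not (min_x <= x < max_x and min_y <= y < max_y):
            continue
        col = min(int((x - min_x) / (max_x - min_x) * n_cols), n_cols - 1)
        row = min(int((y - min_y) / (max_y - min_y) * n_rows), n_rows - 1)
        grids[hour][row][col] += 1
    return dict(grids)


# Hypothetical tweets: (hour, km east, km north). Not real data.
tweets = [
    (10, 2.2, 1.6),
    (10, 2.4, 1.7),
    (12, 5.1, 2.1),
    (12, 9.0, 1.0),  # outside the study area, ignored
]
grids = density_grid(tweets, bbox=(0.0, 0.0, 6.0, 3.5), n_cols=12, n_rows=7)
```

Each entry of `grids` is one hourly density surface, which can then be coloured as a 2d map.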

 


 
As we’ve seen above, 2d mapping is useful for understanding relative density within one period, such as which areas are high and which are low between 12pm and 1pm. However, it does not convey the degree of intensity in the highest peak areas. This is where 3d mapping is needed. In Image 10 to Image 14 we can clearly see both the density of the tweet data in each period and how the intensity of that density compares across the time periods.

The West End shows high density throughout the whole day, whereas the City peaks only around lunchtime. This pattern likely reflects the activities of office workers in the City and of leisure visitors and tourists in the West End.

Image 10. The tweet density in 3d between 10am and 11am on 20th June 2015

Image 11. The tweet density in 3d between 12pm and 1pm on 20th June 2015

 
Image 12. The tweet density in 3d from 12am to 8am

Image 13. The tweet density in 3d from 8am to 4pm

Image 14. The tweet density in 3d from 4pm to Midnight
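One way to make peaks comparable across time periods in 3d is to scale every period's counts by a single shared maximum, rather than normalising each period separately as a 2d colour ramp tends to do. The sketch below shows that scaling. It is an assumption about how extrusion heights could be derived, not a description of how the model in the images was built, and `extrusion_heights` and the 300 m ceiling are hypothetical.

```python
def extrusion_heights(grids, max_height_m=300.0):
    """Convert per-period count grids to heights in metres.

    grids: dict mapping period -> 2d list of tweet counts.
    All periods share one scale, so the tallest column in the whole
    day gets `max_height_m` and every other column is proportional.
    """
    global_max = max(
        (count for grid in grids.values() for row in grid for count in row),
        default=0,
    )
    if global_max == 0:
        return {p: [[0.0 for _ in row] for row in g] for p, g in grids.items()}
    scale = max_height_m / global_max
    return {
        period: [[count * scale for count in row] for row in grid]
        for period, grid in grids.items()
    }


# Hypothetical counts for two one-hour periods on a 2 x 2 grid.
counts = {
    10: [[0, 2], [1, 0]],
    12: [[4, 0], [0, 0]],
}
heights = extrusion_heights(counts)
```

Because both periods share one scale, the 10am peak is drawn at half the height of the 12pm peak, which is the kind of cross-period comparison a per-period 2d map cannot show.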

 

 

 ________________________________________

[1] Linna Li, Michael F. Goodchild & Bo Xu (2013) Spatial, temporal, and socioeconomic patterns in the use of Twitter and Flickr, Cartography and Geographic Information Science, 40:2, 61-77

[2] Taylor Shelton, Ate Poorthuis & Matthew Zook (2015) Social Media and the City: Rethinking Urban Socio-Spatial Inequality Using User-Generated Geographic Information, Landscape and Urban Planning (Forthcoming), http://papers.ssrn.com/abstract=2571757

[3] Nicolas Belmonte, #interactive/Andes, http://twitter.github.io/interactive/andes/ (Retrieved on 15th August 2015)

[4] Andy Hudson-Smith, London’s Twitter Island – From ArcGIS to Max to Lumion, http://www.digitalurban.org/2012/01/londons-twitter-island-from-arcgis-to.html#comment-7314 (Retrieved on 15th August 2015)

[5] Stephan Hugel and Flora Roumpani, Mapping London in real time, using Tweets, https://www.youtube.com/watch?feature=player_embedded&v=3fk_qxGZWFQ (Retrieved on 15th August 2015)

[6] OS Building Heights-Digimap Home Page, http://digimap.edina.ac.uk/webhelp/os/data_information/os_products/os_building_heights.htm (Retrieved on 15th August 2015)

 

Continue reading »

Data and the City workshop (day 2)

The second day of the Data and the City workshop (here are the notes from day 1) started with the session Data Models and the City. Pouria Amirian started with Service Oriented Design and Polyglot Binding for Efficient Sharing and Analysing of Data in Cities. The starting point is that management of the city needs data, and therefore … Continue reading Data and the City workshop (day 2)

Continue reading »

Data and the City workshop (day 1)

The workshop, which is part of the Programmable City project (funded by the European Research Council), is being held in Maynooth today and tomorrow. The papers and discussions touched on multiple current aspects of technology and the city: Big Data, Open Data, crowdsourcing, and critical studies of data and software. The notes below are … Continue reading Data and the City workshop (day 1)

Continue reading »

Beyond quantification: a role for citizen science and community science in a smart city

The Data and the City workshop will run on 31st August and 1st September 2015 at Maynooth University, Ireland. It is part of the Programmable City project, led by Prof Rob Kitchin. My contribution to the workshop is titled Beyond quantification: a role for citizen science and community science in a smart city and extends a short article from … Continue reading Beyond quantification: a role for citizen science and community science in a smart city

Continue reading »