Zika in Twitter: Health Narratives

In the paper, we explored how health narratives and event storylines pertaining to the recent Zika outbreak emerged in social media and how they related to news stories and actual events.

Specifically, we combined actors (e.g. Twitter users), locations (e.g. where the tweets originated), and concepts (e.g. emerging narratives such as pregnancy) to gain insights into the mechanisms that drive participation, contributions, and interactions on social media during a disease outbreak. Below you can read a summary of our paper along with some of the figures that highlight our methodology and findings.

An overview of the Twitter narrative analysis approach, starting with data collection, and proceeding with preprocessing and data analysis to identify narrative events, which can be used to build an event storyline.

Abstract:
 

Background: The recent Zika outbreak witnessed the disease evolving from a regional health concern to a global epidemic. During this process, different communities across the globe became involved in Twitter, discussing the disease and key issues associated with it. This paper presents a study of this discussion in Twitter, at the nexus of location, actors, and concepts.

Objective: Our objective in this study was to demonstrate the significance of 3 types of events (location related, actor related, and concept related) for understanding how a public health emergency of international concern plays out in social media, and Twitter in particular. Accordingly, the study contributes to research efforts toward gaining insights on the mechanisms that drive participation, contributions, and interaction in this social media platform during a disease outbreak.

Methods: We collected 6,249,626 tweets referring to the Zika outbreak over a period of 12 weeks early in the outbreak (December 2015 through March 2016). We analyzed this data corpus in terms of its geographical footprint, the actors participating in the discourse, and emerging concepts associated with the issue. Data were visualized and evaluated with spatiotemporal and network analysis tools to capture the evolution of interest on the topic and to reveal connections between locations, actors, and concepts in the form of interaction networks. 

Results: The spatiotemporal analysis of Twitter contributions reflects the spread of interest in Zika from its original hotspot in South America to North America and then across the globe. The Centers for Disease Control and World Health Organization had a prominent presence in social media discussions. Tweets about pregnancy and abortion increased as more information about this emerging infectious disease was presented to the public and public figures became involved in this. 

Conclusions: The results of this study show the utility of analyzing temporal variations in the analytic triad of locations, actors, and concepts. This contributes to advancing our understanding of social media discourse during a public health emergency of international concern.

Keywords: Zika Virus; Social Media; Twitter Messaging; Geographic Information Systems.
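
The actor analysis summarized above rests on retweet networks. As a rough illustration only (this is not the code used in the paper), the sketch below builds a directed retweet network from hypothetical `(retweeter, original_author)` pairs and ranks accounts by the number of distinct users who retweeted them. The function names and sample records are invented for the example.

```python
from collections import defaultdict


def build_retweet_network(retweets):
    """Map each original author to the set of distinct users who retweeted them."""
    network = defaultdict(set)
    for retweeter, author in retweets:
        if retweeter != author:  # ignore self-retweets
            network[author].add(retweeter)
    return network


def rank_actors(network):
    """Return (author, distinct retweeter count) pairs, most retweeted first."""
    counts = [(author, len(users)) for author, users in network.items()]
    return sorted(counts, key=lambda item: (-item[1], item[0]))


# Hypothetical sample records, invented purely for illustration.
sample = [
    ("user_a", "WHO"), ("user_b", "WHO"), ("user_c", "CDCgov"),
    ("user_a", "CDCgov"), ("user_d", "WHO"), ("user_a", "WHO"),
]
ranking = rank_actors(build_retweet_network(sample))
print(ranking)
```

Counting distinct retweeters rather than raw retweets is a design choice made here so that one very active account cannot dominate the ranking; other centrality measures could be substituted.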

Spatiotemporal participation patterns and identifiable clusters over 4 weeks of our twelve-week study. The top left panel shows the data during the first week, and time progresses from left to right and from top to bottom.

Subsets of the full retweet network pertaining to the WHO (left) and CDC (right), and clusters identified within them. Magenta clusters are centered upon health entities, green upon news organizations, orange upon political entities.

Visualizing a narrative storyline across locations (blue), actors (red), and concepts (green).

Full Reference:

Stefanidis, A., Vraga, E., Lamprianidis, G., Radzikowski, J., Delamater, P.L., Jacobsen, K.H., Pfoser, D., Croitoru, A. and Crooks, A.T. (2017). “Zika in Twitter: Temporal Variations of Locations, Actors, and Concepts”, JMIR Public Health and Surveillance, 3 (2): e22. (pdf)

As normal, any feedback or comments are most welcome. 

New Paper: User-Generated Big Data and Urban Morphology

Continuing our work with crowdsourcing and geosocial analysis, we recently had a paper entitled “User-Generated Big Data and Urban Morphology” published in a special issue of the Built Environment journal. The theme of the special issue, which was guest edited by Mike Batty and includes 12 papers, is “Big Data and the City”. To quote from the website:

“This cutting edge special issue responds to the latest digital revolution, setting out the state of the art of the new technologies around so-called Big Data, critically examining the hyperbole surrounding smartness and other claims, and relating it to age-old urban challenges. Big data is everywhere, largely generated by automated systems operating in real time that potentially tell us how cities are performing and changing. A product of the smart city, it is providing us with novel data sets that suggest ways in which we might plan better, and design more sustainable environments. The articles in this issue tell us how scientists and planners are using big data to better understand everything from new forms of mobility in transport systems to new uses of social media. Together, they reveal how visualization is fast becoming an integral part of developing a thorough understanding of our cities.”

Table of Contents

In our paper we discuss and show how crowdsourced data is leading to the emergence of alternate views of urban morphology that better capture the intricate nature of urban environments and their dynamics. Specifically, we show how such data can provide information pertaining to linked spaces and geosocial neighborhoods. We argue that a geosocial neighborhood is not defined by its administrative boundaries, planning zones, or physical barriers, but rather by its emergence as an organic, self-organized social construct that is embedded in geographical spaces that are linked by human activity. Below are the abstract of the paper and some of its figures, which showcase our work.

“Traditionally urban morphology has been the study of cities as human habitats through the analysis of their tangible, physical artefacts. Such artefacts are outcomes of complex social and economic forces, and their study is primarily driven by traditional modes of data collection (e.g. based on censuses, physical surveys, and mapping). The emergence of Web 2.0 and through its applications, platforms and mechanisms that foster user-generated contributions to be made, disseminated, and debated in cyberspace, is providing a new lens in the study of urban morphology. In this paper, we showcase ways in which user-generated ‘big data’ can be harvested and analyzed to generate snapshots and impressionistic views of the urban landscape in physical terms. We discuss and support through representative examples the potential of such analysis in revealing how urban spaces are perceived by the general public, establishing links between tangible artefacts and cyber-social elements. These links may be in the form of references to, observations about, or events that enrich and move beyond the traditional physical characteristics of various locations. This leads to the emergence of alternate views of urban morphology that better capture the intricate nature of urban environments and their dynamics.”

Keywords: Urban Morphology, Social Media, GeoSocial, Cities, Big Data.
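
To make the idea of geographical spaces "linked by human activity" concrete, here is a minimal sketch under stated assumptions; it is not the method from the paper. It assumes hypothetical `(user, place)` posting records, links two places when at least `min_shared` users posted from both, and groups linked places with a small union-find structure. The threshold, the function name, and the sample data are all illustrative.

```python
from collections import defaultdict
from itertools import combinations


def geosocial_neighborhoods(posts, min_shared=2):
    """Group places that share at least `min_shared` users into linked clusters."""
    users_at = defaultdict(set)
    for user, place in posts:
        users_at[place].add(user)

    # Union-find over places: every place starts as its own cluster.
    parent = {place: place for place in users_at}

    def find(place):
        while parent[place] != place:
            parent[place] = parent[parent[place]]  # path halving
            place = parent[place]
        return place

    # Link two places when enough users were active in both.
    for a, b in combinations(sorted(users_at), 2):
        if len(users_at[a] & users_at[b]) >= min_shared:
            parent[find(a)] = find(b)

    clusters = defaultdict(set)
    for place in users_at:
        clusters[find(place)].add(place)
    return sorted(sorted(group) for group in clusters.values())


# Hypothetical posting records, invented for illustration.
posts = [
    ("u1", "park"), ("u2", "park"), ("u1", "stadium"), ("u2", "stadium"),
    ("u3", "museum"), ("u1", "museum"),
]
neighborhoods = geosocial_neighborhoods(posts)
print(neighborhoods)
```

In this toy example the park and the stadium end up in the same cluster even though nothing says they are spatially adjacent, which is the sense in which a geosocial neighborhood can differ from a spatial one.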

City Infoscapes – Fusing Data from Physical (L1, L2), Social, Perceptual (L3) Spaces to Derive Place Abstractions (L4) for Different Locations (N1, N2).

Recreational Hotspots Composed of “Locals” and “Tourists” with Perceived Artifacts Indicating “Use” and “Need”. (A) High Line Park (B) Madison Square Garden.

Moving from Spatial Neighborhoods to Geosocial Neighborhoods via Links.

The Emergence of Geosocial Neighborhoods in the Aftermath of the 2013 Boston Marathon Bombing.

Full Reference:

Crooks, A.T., Croitoru, A., Jenkins, A., Mahabir, R., Agouris, P. and Stefanidis, A. (2016). “User-Generated Big Data and Urban Morphology”, Built Environment, 42 (3): 396-414. (pdf)

Summer Projects

Over the summer, Arie Croitoru and I took part in the George Mason University Aspiring Scientists Summer Internship Program. We worked with three very talented high-school students who, over the course of the seven-and-a-half-week program, produced some excellent research in the areas of agent-based modeling and social media analysis. An overview of their work can be seen in the posters and abstracts that the students produced at the end of the internship.

Lawrence Wang explored how social media could be used to predict election results in a project entitled “And the Winner Is? Predicting Election Results using Social Media”. Below you can read Lawrence’s abstract and see his poster.

“The 2012 U.S. presidential election demonstrated how Twitter can serve as a widely accessible forum of political discourse. Recently, researchers have investigated whether social media, particularly Twitter, can function as a predictive tool. In the past decade, multiple studies have claimed to successfully predict the results of elections using Twitter data. However, many of these studies fail to account for the inherent population bias present in Twitter data, leading to ungeneralizable results. In this project, I investigate the prospects of using Twitter data as an alternative to poll data for predicting the 2012 presidential election. The tweet corpus consisted of tweets published one month before the November election day. Using VADER, a sentiment analysis tool, I analyzed over 140,000 tweets for political sentiment. I attempted to circumvent the Twitter population bias by comparing age, race, and gender metrics of the Twitter population with that of the U.S. population. Furthermore, I utilized Bayesian inference with prior distributions from the results of the 2008 presidential election in order to mitigate the effects of limited tweet data in certain states. The resulting model correctly predicted the likely outcomes of 46 of the 50 states and predicted that President Obama would be reelected with a probability of 0.945. Such a model could be used to explore the forthcoming elections. ” 
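
The abstract does not spell out the exact model, so the following is only a sketch of the general idea of combining a prior from a previous election with limited tweet data. It assumes a Beta-Binomial model for a single state: the prior is centered on a hypothetical 2008 vote share with an assumed strength (`prior_strength` pseudo-observations), and the data are hypothetical counts of tweets scored as favorable. The sentiment scoring step (VADER in the project) is not shown, and all numbers are invented.

```python
import random


def posterior_params(prior_share, prior_strength, favorable, total):
    """Update a Beta prior centered on `prior_share` with favorable/total tweet counts."""
    alpha = prior_share * prior_strength + favorable
    beta = (1.0 - prior_share) * prior_strength + (total - favorable)
    return alpha, beta


def win_probability(alpha, beta, draws=20000, seed=0):
    """Monte Carlo estimate of P(support > 0.5) under a Beta(alpha, beta) posterior."""
    rng = random.Random(seed)
    wins = sum(rng.betavariate(alpha, beta) > 0.5 for _ in range(draws))
    return wins / draws


# Hypothetical state: 55% prior share, weak tweet evidence (12 of 30 favorable).
alpha, beta = posterior_params(prior_share=0.55, prior_strength=100,
                               favorable=12, total=30)
posterior_mean = alpha / (alpha + beta)
p_win = win_probability(alpha, beta)
print(round(posterior_mean, 3), p_win)
```

With only 30 tweets, the posterior mean stays close to the prior, which is the mitigating effect for data-poor states that the abstract describes. The choice of `prior_strength` controls how quickly tweet data can override the previous result.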

In a second project, entitled “Context Graphs: A Knowledge-Driven Model for Contextualizing Twitter Discourse”, Varun Talwar explored how knowledge bases could be utilized to better contextualize social media discussions. Below you can read Varun’s project abstract and see his end-of-project poster.

Introduction: User posted content through online social media (SM) platforms in recent years has emerged as a rich field for narrative analysis of topics captured during the discussion discourse. In particular, collective discourse has been used to manually contextualize public perception of health related events.

Objective: As SM feeds tend to be noisy, automated detection of the context of a given SM discourse stream has proven to be a challenging task. The primary objective of this research is to explore how existing knowledge bases could be utilized to better contextualize SM discussions through topic modeling and mining. By utilizing such existing knowledge it would then be possible to explore to what extent a given discourse is related to a known or a new context, as well as compare and contrast SM discussions through their respective contexts.

Methods: In order to accomplish these goals this research proposes a novel approach for contextualizing SM discourse. In this approach, topic modeling is combined with a knowledge base in a two-step process. First, key topics are extracted from a SM data corpus by applying a statistical topic-modeling algorithm, a process that also results in data dimensionality reduction. Once a set of salient topics is extracted, each topic is then used to mine the knowledge base for subgraphs that represent the contextual linkages between knowledge elements. Such subgraphs can then further disambiguate the topic modeling results, and be utilized for qualifying context similarity across SM discussions.

Results: The time-series analysis of the Twitter discourse via graph-matching algorithms reveals the change in topics as evidenced by the emergence of the terms “pregnancy” and “abortion” as information about the virus propagated through the Twitter community.
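
The second step of the approach described in the abstract, mining a knowledge base for subgraphs around topic terms and comparing the results, can be sketched as follows. This is an illustration under assumptions, not the project's implementation: the knowledge base is a tiny hand-made adjacency dictionary, the topic terms are given rather than produced by a topic model, and Jaccard similarity over edge sets stands in for the graph-matching algorithms mentioned in the abstract.

```python
from collections import deque


def context_subgraph(knowledge_base, topic_terms, max_hops=1):
    """Collect undirected edges reachable within `max_hops` of any topic term."""
    edges = set()
    seen = {term for term in topic_terms if term in knowledge_base}
    queue = deque((term, 0) for term in sorted(seen))
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in knowledge_base.get(node, ()):
            edges.add(frozenset((node, neighbor)))
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
    return edges


def context_similarity(edges_a, edges_b):
    """Jaccard similarity between two edge sets (a simple stand-in for graph matching)."""
    union = edges_a | edges_b
    return len(edges_a & edges_b) / len(union) if union else 0.0


# Toy knowledge base, invented for illustration.
kb = {
    "zika": ["mosquito", "pregnancy", "microcephaly"],
    "pregnancy": ["zika", "microcephaly", "abortion"],
    "mosquito": ["zika", "dengue"],
    "dengue": ["mosquito"],
    "microcephaly": ["zika", "pregnancy"],
    "abortion": ["pregnancy"],
}
early = context_subgraph(kb, ["zika", "mosquito"])
later = context_subgraph(kb, ["zika", "pregnancy"])
similarity = context_similarity(early, later)
print(similarity)
```

Comparing the context graphs of two time windows in this way gives a single number for how much the discussion context has shifted; here the later window picks up the pregnancy and abortion linkage that the earlier one lacks.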

Elizabeth Hu explored the current migration crisis in Europe in a project entitled “Across the Sea: A Novel Agent-Based Model for the Migratory Patterns of the European Refugee Crisis”. Below are Elizabeth’s abstract, poster, and an example model run.

“Since 2010, a growing number of refugees have sought asylum in European nations, fleeing violence and military conflict in their home countries. Most of the refugees originate from Syria, Iraq, Afghanistan, and African nations. The vast majority of refugees risk their lives in the popular yet perilous Mediterranean Sea Route often prone to boat accidents and subsequent deaths of migrants.  The flow of millions of refugees has introduced a humanitarian crisis not seen since World War II. European nations are struggling to cope with the influx of refugees through various border policies.

In order to explore this crisis, a geographically explicit agent-based model has been developed to study the past and future patterns of refugee flows. Traditional migration models, which represent the population as an aggregate, fail to consider individual decision-making processes based on personal status and intervening opportunities. However, the novel agent-based model developed here of migration allows population behavior to emerge as the result of individual decisions. Initial population, city, and route attributes are based upon data from the UNHCR, EU agencies, crowd-sourced databases, and news articles. The agents, refugees, select goal destinations in accordance with the Law of Intervening Opportunities. Thus, goals are prone to change with fluctuating personal needs. Agents choose routes not only based on distance, but also other relevant route attributes. The resulting migration flows generated by the model under various circumstances could provide crucial guidance for policy and humanitarian aid decisions.”
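
The abstract does not say how the Law of Intervening Opportunities is operationalized in the model, so the sketch below only illustrates one common formulation, usually attributed to Schneider: destinations are ranked by distance, and the probability of stopping at a destination is the drop in exp(-L * V) as its opportunities are added to the cumulative total V. The `search_rate` parameter (L), the city names, and the opportunity counts are all hypothetical, and the probabilities are normalized over the listed destinations.

```python
import math


def intervening_opportunity_probs(destinations, search_rate=0.001):
    """Destination choice probabilities for (name, distance, opportunities) tuples."""
    ranked = sorted(destinations, key=lambda d: d[1])  # nearest first
    probs = {}
    cumulative = 0.0
    for name, _distance, opportunities in ranked:
        before = math.exp(-search_rate * cumulative)
        cumulative += opportunities
        after = math.exp(-search_rate * cumulative)
        probs[name] = before - after  # chance of stopping at this destination
    total = sum(probs.values())
    return {name: p / total for name, p in probs.items()}


# Hypothetical destinations with equal opportunities at increasing distances.
cities = [("city_c", 800, 500), ("city_a", 100, 500), ("city_b", 300, 500)]
probs = intervening_opportunity_probs(cities)
print(probs)
```

With equal opportunities everywhere, nearer destinations are favored because they are encountered first; an agent whose needs change could simply be re-evaluated with updated opportunity values, which is consistent with the abstract's remark that goals are prone to change.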

The movie below gives a sense of the migration paths the refugees are taking.

Megacities through the Lens of Social Media

Megacities, which can be roughly defined as cities with a population of over 10 million people, are on the increase due to ongoing urbanization trends. The United Nations notes that since the 1970s the number of megacities has more than tripled (from 8 to 34), and it is expected to nearly double again by 2050 (to exceed 60).

The question we are asking is how GeoSocial analysis can help us understand such cities. To this end, we have recently had a paper entitled “Megacities Through the Lens of Social Media” published in the Journal of the Homeland Defense and Security Information Analysis Center (HDIAC). In the paper we discuss the opportunities and challenges that social media brings with respect to understanding the physical and cyber spaces within megacities. Below you can see the synopsis of our paper.

Due to ongoing urbanization trends the worldwide urban population is projected to grow from half of the global population (today) to two thirds of it by 2030. Almost all the new megacities that will emerge through this process are in geopolitical hotspots of southeast Asia and sub-Saharan Africa. Therefore, the U.S. Department of Defense must consider the challenges presented by engagement in such environments when planning for the future. The physical challenge of operating in such dense, highly three-dimensional, environments is only compounded by the added challenge presented by the advanced functional complexity of these environments: megacities function at the intersection of the physical, social, and cyber spaces. Accordingly, military operations in these locations must prepare to engage in environments where news, ideas, and opinions are shaped in cyberspace and propagated across the physical urban landscape. As social networks connect (or, often, divide) populations they form communities and facilitate their mobilization.

We have observed these processes time and again, from the streets of Cairo during the Arab Spring, to the streets of Tokyo during the Fukushima nuclear disaster, and the streets of Paris during the recent ISIL terrorist attacks. Advancing our capability to analyze crowd-generated content in the form of social media feeds is a substantial scientific challenge with considerable implications for future DoD operations. In this publication, we use representative examples to demonstrate the opportunities and challenges associated with such information, especially as they relate to large urban areas. 

An emerging framework to study urban systems.

Social networks embedded within a geographical context, leading to connected, non-contiguous areas.

Full Reference: 

Stefanidis, A., Jenkins, A., Croitoru, A. and Crooks, A. (2016). “Megacities Through the Lens of Social Media”, Journal of the Homeland Defense & Security Information Analysis Center (HDIAC), 3 (1): 24-29. (pdf)

Measles Vaccination Narrative in Twitter

A summary of our approach
Continuing our work on GeoSocial analysis, we have recently published a paper in JMIR Public Health and Surveillance entitled “The Measles Vaccination Narrative in Twitter: A Quantitative Analysis”. In this paper we explore how social media can be studied quantitatively to understand the narrative behind measles vaccinations. Below you can read the abstract of the paper, which covers why we chose to study this topic, the study objective, our methodology, and a summary of our results and conclusions.

Background: The emergence of social media is providing an alternative avenue for information exchange and opinion formation on health-related issues. Collective discourse in such media leads to the formation of a complex narrative, conveying public views and perceptions.

Objective: This paper presents a study of Twitter narrative regarding vaccination in the aftermath of the 2015 measles outbreak, both in terms of its cyber and physical characteristics. The contributions of this work are the analysis of the data for this particular study, as well as presenting a quantitative interdisciplinary approach to analyze such open-source data in the context of health narratives.

Methods: 669,136 tweets were collected in the period February 1 through March 9, 2015 referring to vaccination. These tweets were analyzed to identify key terms, connections among such terms, retweet patterns, the structure of the narrative, and connections to the geographical space.

Results: The data analysis captures the anatomy of the themes and relations that make up the discussion about vaccination in Twitter. The results highlight the higher impact of stories contributed by news organizations compared to direct tweets by health organizations in communicating health-related information. They also capture the structure of the anti-vaccination narrative and its terms of reference. Analysis also revealed the relationship between community engagement in Twitter and state policies regarding child vaccination. Residents of Vermont and Oregon, the two states with the highest rates of non-medical exemption from school-entry vaccines nationwide, are leading the social media discussion in terms of participation.

Conclusions: The interdisciplinary study of health-related debates in social media across the cyber-physical debate nexus leads to a greater understanding of public concerns, views, and responses to health-related issues. Further coalescing such capabilities shows promise towards advancing health communication, supporting the design of more effective strategies that take into account the complex and evolving public views of health issues.

Global distribution of tweets in our data corpus
The paper is open access and can be viewed and downloaded from here.
Full reference:

Radzikowski, J., Stefanidis, A., Jacobsen, K.H., Croitoru, A., Crooks, A.T. and Delamater, P.L. (2016). “The Measles Vaccination Narrative in Twitter: A Quantitative Analysis”, JMIR Public Health and Surveillance, 2(1):e1.

Hashtag associations: clustering based on co-occurrences of hashtags in individual tweets

Linking Cyber and Physical Spaces

We have just published a new paper in Computers, Environment and Urban Systems entitled “Linking Cyber and Physical Spaces Through Community Detection And Clustering in Social Media Feeds”. In the paper we explore how geosocial media provides a new avenue for social communication and a novel source of geosocial information.
In particular, we discuss the notion of physical presence within social media and its importance for exploring the relation between the cyber and the physical domains. We discuss how communities and groups can be detected in both the cyber and physical space, and how they can be processed to form a ‘hybrid’ geosocial view of communities using social network analysis, community detection (the Louvain method) and DenStream. To showcase these concepts and their benefits, we present the analysis of two case studies that make use of Twitter data associated with two different types of events: a planned activity during the Occupy Wall Street (OWS) Day of Action (November 17th, 2011), and the response to the Boston Marathon Bombing (April 15, 2013). We conclude with a summary and outlook. Below is the abstract of the paper:

Over the last decade we have witnessed a significant growth in the use of social media. Interactions within their context lead to the establishment of groups that function at the intersection of the physical and cyber spaces, and as such represent hybrid communities. Gaining a better understanding of how information flows in these hybrid communities is a substantial scientific challenge with significant implications on our ability to better harness crowd-contributed content. This paper addresses this challenge by studying how information propagates and evolves over time at the intersection of the physical and cyber spaces. By analyzing the spatial footprint, social network structure, and content in both physical and cyber spaces we advance our understanding of the information propagation mechanisms in social media. The utility of this approach is demonstrated in two real-world case studies, the first reflecting a planned event (the Occupy Wall Street – OWS – movement’s Day of Action in November 2011), and the second reflecting an unexpected disaster (the Boston Marathon bombing in April 2013). Our findings highlight the intricate nature of the propagation and evolution of information both within and across cyber and physical spaces, as well as the role of hybrid networks in the exchange of information between these spaces.

Research highlights include:

    • Our analysis includes two major events as captured in Twitter.
    • The themes in cyber and physical communities tend to converge over time.
    • Messages among physical space users are more consistent at the onset of the event.
    • Geolocated users are consuming information more than they produce.

Below are some of the images from the paper. The first image shows how one can think of the relationships between physical and cyber spaces. The next image provides an overview of our geosocial analysis framework for examining cyber and physical communities.

Our geosocial analysis framework

In the figure below we show an example of using DenStream for spatiotemporal clustering and how the process can capture the protest activities that were planned for the Occupy Wall Street movement’s Day of Action. Each dot corresponds to the originating location of a geolocated tweet; the color of each point indicates the time of the corresponding tweet, ranging from dark blue (early morning, 0) to dark red (late night, 1). Each circle represents a specific spatiotemporal cluster. For example, the circle labeled A marks the start of the day, when people congregated around Wall Street, while the circle labeled C shows a cluster at Foley Square.
Physical space groups identified in the lower Manhattan area. Each dot corresponds to the originating location of a geolocated tweet; the color of each point indicates the time of the corresponding tweet, ranging from dark blue (early morning, 0) to dark red (late night, 1).
In the figure below we show one example of linking the cyber and physical communities. Specifically, (a) shows the top five communities (node degree > 100) in the cyber space retweet network (each community is designated by one color); (b) shows the physical space groups; and (c) shows the resulting hybrid meta-network, with the connections between physical groups (P nodes) and cyber space communities (C nodes).
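
To make the idea of a hybrid meta-network more concrete, here is a minimal sketch in Python. It is not the procedure from the paper: it assumes, purely for illustration, that a physical group (P node) and a cyber community (C node) are linked whenever they share users, with the edge weight equal to the number of shared users. The function name `hybrid_edges` and the sample user ids are hypothetical.

```python
from collections import Counter


def hybrid_edges(physical, cyber):
    """Weight each (P group, C community) link by the users the two share.

    physical: dict mapping user -> physical-space group id (geolocated users)
    cyber:    dict mapping user -> cyber-space community id
    Assumption for this sketch: shared membership defines a link.
    """
    edges = Counter()
    for user, p_group in physical.items():
        c_comm = cyber.get(user)
        if c_comm is not None:
            edges[(f"P{p_group}", f"C{c_comm}")] += 1
    return edges


# Made-up memberships for illustration only; not data from the paper.
physical = {"u1": 1, "u2": 1, "u3": 2}
cyber = {"u1": 1, "u2": 3, "u3": 3, "u4": 1}
links = hybrid_edges(physical, cyber)
```

In practice the two input mappings would come from the clustering step (physical groups) and the community detection step (cyber communities); here they are simply typed in by hand.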

We hope you enjoy the paper.

Full Reference:

Croitoru, A., Wayant, N., Crooks, A.T., Radzikowski, J. and Stefanidis, A. (2014), Linking Cyber and Physical Spaces Through Community Detection And Clustering in Social Media Feeds, Computers, Environment and Urban Systems. doi:10.1016/j.compenvurbsys.2014.11.002

IR: State-Driven and Citizen-Driven Networks

Our work exploring how social media can be used to study events around the world has resulted in a new publication in the Social Science Computer Review entitled “International Relations: State-Driven and Citizen-Driven Networks.” In essence, we compare traditional international relations (e.g., as seen in United Nations General Assembly voting patterns) with those arising from bottom-up interactions (i.e., from people on the ground). The abstract of the paper is below, along with some of the images that accompany the paper.

The international community can be viewed as a set of networks, manifested through various transnational activities. The availability of longitudinal datasets such as international arms trades and United Nations General Assembly (UNGA) voting records allows for the study of state-driven interactions over time. In parallel to this top-down approach, the recent emergence of social media is fostering a bottom-up and citizen-driven avenue for international relations (IR). The comparison of these two network types offers a new lens to study the alignment between states and their people. This paper presents a network-driven approach to analyze communities as they are established through different forms of bottom-up (e.g. Twitter) and top-down (e.g. UNGA voting records and international arms trade records) IR. By constructing and comparing different network communities we were able to evaluate the similarities between state-driven and citizen-driven networks. In order to validate our approach we identified communities in UNGA voting records during and after the Cold War. Our approach showed that the similarity between UNGA communities during and after the Cold War was 0.55 and 0.81 respectively (on a 0-1 scale). To explore the state- versus citizen-driven interactions we focused on the recent events in Syria as discussed on Twitter over a sample period of one month. The analysis of these data shows a clear misalignment (0.25) between citizen-formed international networks and the ones established by the Syrian government (e.g. through its UNGA voting patterns).
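
For readers wondering how two sets of communities can be compared on a 0-1 scale, the sketch below may help. The similarity measure used in the paper is not described in this post, so the Rand index is used here as a stand-in: it is the share of node pairs that both partitions treat the same way (together in both, or apart in both). The function name `partition_similarity` and the country labels are hypothetical.

```python
from itertools import combinations


def partition_similarity(part_a, part_b):
    """Rand index: share of node pairs on which two partitions agree.

    part_a, part_b: dicts mapping node -> community label.
    Only nodes present in both partitions are compared.
    This is a stand-in measure, not necessarily the one used in the paper.
    """
    nodes = sorted(set(part_a) & set(part_b))
    pairs = list(combinations(nodes, 2))
    if not pairs:
        raise ValueError("need at least two shared nodes")
    agree = 0
    for u, v in pairs:
        same_a = part_a[u] == part_a[v]
        same_b = part_b[u] == part_b[v]
        agree += same_a == same_b
    return agree / len(pairs)


# Hypothetical country groupings, for illustration only.
state_driven = {"A": 1, "B": 1, "C": 2, "D": 2}
citizen_driven = {"A": 1, "B": 2, "C": 2, "D": 2}
score = partition_similarity(state_driven, citizen_driven)
```

A score of 1 means the two partitions group every pair of countries identically, and lower scores mean the groupings disagree on more pairs.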

Full reference:

Crooks, A.T., Masad, D., Croitoru, A., Cotnoir, A., Stefanidis, A. and Radzikowski, J. (2013), International Relations: State-Driven and Citizen-Driven Networks, Social Science Computer Review. DOI:10.1177/0894439313506851

If you don’t have access to Social Science Computer Review, send us an email and we can send you an early version of the paper. This is only part of our work on using multiple networks to explore international relations, and the networks can of course be explored in more detail. For example, in the figure below we plot the actual transfers of arms between states between 2001 and 2011. One can clearly see which states are connected with Syria, and that Russia has connections to many states.

Arms transfers
Or, if we take Twitter hashtags and add an edge between any pair of hashtags used in the same tweet, we can explore an emergent ontology of the topic labels that users associate with each other. For example, the #Aleppo hashtag is associated with other hashtags that appear to relate to local events, including “#civilian”, “#airstrike”, “#hunger”, and “#pictures”, many of which are connected only to the #Aleppo hashtag, as shown below.
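
The hashtag network described above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used in the paper: the regular expression, the lowercasing of hashtags, and the helper names `cooccurrence_edges` and `neighbours` are all assumptions, and the sample tweets are made up.

```python
import re
from collections import Counter
from itertools import combinations

HASHTAG = re.compile(r"#(\w+)")


def cooccurrence_edges(tweets):
    """Count how often each unordered pair of hashtags shares a tweet."""
    edges = Counter()
    for text in tweets:
        # Lowercase and de-duplicate so "#Aleppo #aleppo" yields no self-pair.
        tags = sorted({t.lower() for t in HASHTAG.findall(text)})
        for a, b in combinations(tags, 2):
            edges[(a, b)] += 1
    return edges


def neighbours(edges, tag):
    """Hashtags that co-occur with `tag` at least once."""
    tag = tag.lower()
    out = set()
    for a, b in edges:
        if a == tag:
            out.add(b)
        elif b == tag:
            out.add(a)
    return out


# Made-up tweets for illustration only; not data from the paper.
sample = [
    "Reports of an #airstrike in #Aleppo",
    "#Aleppo #civilian #hunger",
    "#Syria talks resume, #Aleppo still contested",
]
edges = cooccurrence_edges(sample)
aleppo_links = neighbours(edges, "Aleppo")
```

Hashtags whose only neighbour is one hub (as with several tags around #Aleppo in the post) would show up here as tags that appear in exactly one edge.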
