Searching Twitter with ArcGIS Pro Using R

I committed to testing this a long time ago; however, a number of other projects intervened, so I have only just got around to writing up this short tutorial. One of the exciting things from the ESRI Developers Conference this year was the launch of the R-ArcGIS bridge. In simple terms, this enables you to run R scripts from within ArcGIS and share data between the two pieces of software. In fact, this is all explained in a nice interview here.

I won’t go into detail about the R script itself; the code can be found on GitHub. If I am honest, it is pretty rough and was written to demonstrate what could be done – that said, it should be usable (I hope… but don’t complain if it isn’t!). ESRI have also provided a nice example, which can be found here and was the basis of my code.

Preparing R

Before you can link ArcGIS Pro to R, you need to install and load the ‘arcgisbinding’ package, which is unfortunately not on CRAN. There are instructions here on how to do this using a Python toolbox; however, I preferred a more manual approach.

Open up R and run the following commands, which install the various packages used by the toolbox. You might also need to install the Rtools utilities, as you will be compiling on Windows (available here). Although the twitteR and httr packages are available on CRAN, for some reason I have been having issues with the latest versions failing to authenticate with Twitter; as such, links to some older versions are provided.

#Install the arcgisbinding package
install.packages("https://4326.us/R/bin/windows/contrib/3.2/arcgisbinding_1.0.0.111.zip", repos=NULL, method="libcurl")

#Install older versions of the httr and twitteR packages (httr first, as twitteR depends on it)
install.packages("https://cran.r-project.org/src/contrib/Archive/httr/httr_0.6.0.tar.gz", repos=NULL, method="libcurl")
install.packages("https://cran.r-project.org/src/contrib/Archive/twitteR/twitteR_1.1.8.tar.gz", repos=NULL, method="libcurl")

#Load the arcgisbinding package and check license
library(arcgisbinding)
arc.check_product()

Creating a Twitter Search App

Before you can use the Twitter Search Tool in ArcGIS Pro, you first need to register an app with Twitter, which gives you a series of codes that are required to access their API.

  1. Visit https://apps.twitter.com/ and log in with your Twitter username and password.
  2. Click the “Create New App” button, where you will need to specify a number of details about the application. I used the following:
    • Name: ArcGIS Pro Example
    • Description: An application testing R integration with ArcGIS Pro and Twitter
    • Website: http://www.alex-singleton.com
  3. I left the callback URL blank, then checked the “Yes, I agree” box for the developer agreement, and clicked the “Create your Twitter application” button.
  4. On the page that opens, you then need to click on the “Keys and Access Tokens” tab. You need four pieces of information that enable the Toolbox to link up with Twitter. The first two are displayed – the “Consumer Key (API Key)” and the “Consumer Secret (API Secret)”. You then need to authorize this application for your account. You do this by clicking the “Create my access token” button at the base of the page. This creates two new codes, which are now displayed – the “Access Token” and “Access Token Secret”. You now have the four codes required to run a Twitter search in ArcGIS Pro; a quick way of checking that they work is sketched below.
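
Before wiring these codes into the toolbox, it is worth checking that they work in a plain R session. The following is a minimal sketch – the four strings are placeholders for your own codes:

library(twitteR)

#Placeholder credentials - substitute the four codes from apps.twitter.com
consumer_key    <- "YOUR_CONSUMER_KEY"
consumer_secret <- "YOUR_CONSUMER_SECRET"
access_token    <- "YOUR_ACCESS_TOKEN"
access_secret   <- "YOUR_ACCESS_TOKEN_SECRET"

#Authenticate the session with Twitter
setup_twitter_oauth(consumer_key, consumer_secret, access_token, access_secret)

#If authentication succeeded, a trivial search should return some statuses
head(searchTwitter("Beatles", n = 5))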

R Script

I created an R script that:
1. Authenticates a session with Twitter
2. Performs a search query for a user specified term within a proximity (10 miles) of a given lat / lon location
3. Outputs the results as a Shapefile in a specified folder

The inputs to the script include the various access codes, a location, a search term and an output file location. These variables are all fed into the script from the Toolbox inputs. Getting the inputs is relatively simple – they appear in the order that they are added to the Toolbox, and are acquired via in_params[[x]], where x is the order number; thus search_term = in_params[[1]] pulls a search term into a new R object called “search_term”. The basic structure of a script is as follows (code snippet provided by ESRI):

tool_exec <- function(in_params, out_params) {
  # the first input parameter, as a character vector
  input.dataset <- in_params[[1]]
  # alternatively, can access by the parameter name:
  input.dataset <- in_params$input_dataset

  print(input.dataset)
  # ... do analysis steps

  out_params[[1]] <- results.dataset
  return(out_params)
}

For more details about the functions available in arcgisbinding, see the documentation located here.
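
To give a flavour of how the pieces fit together, here is a condensed sketch of a Twitter search script. This is not the exact code on GitHub: the parameter order, the tweet limit of 100, and the use of an sp SpatialPointsDataFrame with arc.write are illustrative assumptions.

tool_exec <- function(in_params, out_params) {
  library(twitteR)
  library(sp)

  #Toolbox inputs, in the order they were added to the Toolbox (illustrative)
  search_term     <- in_params[[1]]
  consumer_key    <- in_params[[2]]
  consumer_secret <- in_params[[3]]
  access_token    <- in_params[[4]]
  access_secret   <- in_params[[5]]
  lat             <- as.numeric(in_params[[6]])
  lon             <- as.numeric(in_params[[7]])
  out_shp         <- out_params[[1]]

  #1. Authenticate a session with Twitter
  setup_twitter_oauth(consumer_key, consumer_secret, access_token, access_secret)

  #2. Search for the term within 10 miles of the supplied location
  tweets <- searchTwitter(search_term, n = 100,
                          geocode = paste(lat, lon, "10mi", sep = ","))
  tweets_df <- twListToDF(tweets)

  #Keep only the geotagged tweets, as the rest cannot be mapped
  tweets_df$longitude <- as.numeric(tweets_df$longitude)
  tweets_df$latitude  <- as.numeric(tweets_df$latitude)
  tweets_df <- tweets_df[!is.na(tweets_df$longitude) & !is.na(tweets_df$latitude), ]

  #3. Promote to points in WGS84 and write out through the bridge
  coordinates(tweets_df) <- c("longitude", "latitude")
  proj4string(tweets_df) <- CRS("+init=epsg:4326")
  arc.write(out_shp, tweets_df)

  return(out_params)
}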

How to use the Twitter Search Tool

The Twitter Search Tool runs within ArcGIS Pro and requires you to add a new toolbox. The toolbox should be downloaded along with the R script and placed in a folder somewhere on your hard drive. The files can be found on GitHub here.

  1. Open ArcGIS Pro and create a new blank project called Twitter Map.
  2. Create a new map from the Insert menu.
  3. From the Map tab, click the “Basemap” button and select the OpenStreetMap tile layer.
  4. Zoom into Liverpool on the map using the navigation wheel.
  5. Find the latitude and longitude of the map centre. These are recorded just under the map on the window border. The centre of Liverpool is approximately -2.95 (longitude), 53.4 (latitude) (although displayed as 002.95W, 53.40N).
  6. Click on the “Insert” menu, then the “Toolbox” and “Add Toolbox” buttons. Navigate to the folder where you have the Toolbox and R script. Click on the Twitter.tbx file and press the “Select” button.
  7. If you don’t see a side bar called “Geoprocessing”, then click on the “Analysis” tab and press the “Tools” button. After this is visible, under the search box there is a “Toolboxes” link. Click this and you will see the Twitter toolbox listed. If you look inside the toolbox you will see the Twitter Search script – click on this to open.
  8. Enter a search term (I used “Beatles” – hey we are in Liverpool), the Twitter authentication details, the location and where you want the output Shapefile stored. This defaults to the geodatabase associated with the project; however, you can browse to a folder and specify a Shapefile name – e.g. Twitter_Beatles.shp.
  9. Press the “Run” button and with luck you should now have a Shapefile created in the folder specified.
  10. Add the results to your map by clicking on the “Map” tab, then the “Add Data” button. Browse to where you saved the Shapefile and click the “Select” button.

The following screenshot shows the Shapefile on an OpenStreetMap basemap, with the attribute table also open – you will see that the full Tweet details are stored as attributes associated with each point.


Anyway, I hope this is of use and can assist people getting started linking R to ArcGIS.


Geocomputation – A Practical Primer

My new book (co-edited with Chris Brunsdon) is now out.

Many thanks to all the chapter authors for their hard work; and if it has not arrived already, a very brightly coloured book should be in the post!

Thanks also go to Sage for a really nice production job.

About the book…

Geocomputation is the intersection of advanced computational methods and geographical analysis and modelling. Geocomputation is applied and often interdisciplinary, with methodological developments typically embedded in applications seeking to address real world problems.

Geocomputation excels as a framework for researching many contemporary social science problems associated with large volumes of dynamic and spatio-temporal ‘big data’, such as those generated in ‘smart city’ contexts or from crowdsourcing.
This text:

  1. Provides a selection of practical examples of geocomputation techniques and ‘hot topics’ written by world leading practitioners
  2. Integrates selected supporting materials, such as code and data, so that readers can work through some examples themselves
  3. Provides highly applied and practical chapter discussions of: visualisation and exploratory spatial data analysis; space-time modelling; spatial algorithms; spatial regression and statistics; open geographic information systems and science
  4. Presents chapters that are uniform in design, each including an introduction, case study and conclusion – drawing together both the generalities of the chapter topic and illustration through the case study application. Guidance for further reading is also provided.

This accessible text, published in full colour, has been specifically designed for those readers who are new to Geocomputation as an area of research, showing how complex real-world problems can be solved through the integration of technology, data, and geocomputational methods. This is the key primer for applied Geocomputation in the social sciences.

GIScience Research Group

The GIScRG is a group of academics and practitioners interested in promoting GIScience and GITechnology in geographical research, teaching and the workplace. We also support and promote e-science and the application of novel computing and spatial analysis paradigms to geographical systems, for example, agent-based modelling. We are currently sponsoring a number of sessions at the …


Quantitative Geography as a Profession

The following case studies demonstrate the widespread use of quantitative methods and GIS in the workplace, and how quantitative skills can be employed to produce excellent student work across the discipline.

Case studies:

  • Risk Insurance
  • GIS and Mapping
  • Environmental Sector
  • Environmental Consultancy
  • Humanitarian Sector
  • Local Government
  • Financial Sector
  • Student Work

Featured image: https://www.flickr.com/photos/churkinms/2582615161/sizes/o/


Transport Map Book

The Transport Map Books are available for each local authority district in England and present a series of maps related to commuting behaviour. The data are derived from multiple sources including: the 2011 Census, Department for Transport estimates and the results of a research project looking at carbon dioxide emissions linked to the school commute.

All the maps are available to download HERE; and the R code used to create them and the emissions model is on GitHub.

Travel to work flows

These data relate to Middle Layer Super Output Area (MSOA) level estimates of travel to work flows by transport mode. The raw data are available from the ONS. For the maps, the flows have been limited to those both originating and terminating within each local authority district.

Accessibility to Services

The Department for Transport provides a range of statistics at Lower Layer Super Output Area level about accessibility and connectivity to a series of key services. A subset of these variables was mapped.

Emissions associated with the school commute

These data were generated as part of an ESRC funded project investigating emissions associated with the school commute. The model provides an estimate of the carbon dioxide emitted at Lower Layer Super Output Area level. For full details of the methodology, see the open access paper:

Singleton, A. (2013) A GIS Approach to Modelling CO2 Emissions Associated with the Pupil-School Commute. International Journal of Geographical Information Science, 28(2):256–273.

Car availability and travel to work mode choice

These attributes were extracted from the 2011 census data provided by Nomis at Output Area level.

Distance and mode of travel to work

Workplace zones are a new geography for the 2011 census, designed for the dissemination of daytime population statistics. A number of transport-related attributes were selected, which were also downloaded from Nomis.


Temporal OAC

As part of an ESRC Secondary Data Analysis Initiative grant Michail Pavlis, Paul Longley and I have been working on developing methods by which temporal patterns of geodemographic change can be modelled.

Much of this work has been focused on census-based classifications, such as the 2001 Output Area Classification (OAC) and the 2011 OAC released today. We have been particularly interested in examining methods by which secondary data might be used to create measures enabling the screening of small areas over time, as uncertainty builds as a result of changes in residential structure. The write-up of this work is currently out for review; however, we have placed the census-based classification created for the years 2001 – 2011 on the new public.cdrc.ac.uk website, along with a change measure.

Some findings

  • Eight clusters were found to be of greatest utility for describing OA change between 2001 and 2011; they included:
    • Cluster 1 – “Suburban Diversity”
    • Cluster 2 – “Ethnicity Central”
    • Cluster 3 – “Intermediate Areas”
    • Cluster 4 – “Students and Aspiring Professionals”
    • Cluster 5 – “County Living and Retirement”
    • Cluster 6 – “Blue-collar Suburbanites”
    • Cluster 7 – “Professional Prosperity”
    • Cluster 8 – “Hard-up Households”

A map of the clusters in 2001 and 2011 for Leeds is as follows:

  • The changing cluster assignments between 2001 and 2011 reflected:
    • Developing “Suburban Diversity”
    • Gentrification of central areas, leading to growing “Students and Aspiring Professionals”
    • Regional variations:
      • “Ethnicity Central” was more stable between 2001 and 2011 in the South East and London than in the North West and North East, perhaps reflecting differing structural changes in central areas (e.g. gentrification)
      • “Hard-up Households” were more stable in the North West and North East than in the South East or London; in the South East, and acutely so in London, flows were predominantly towards “Suburban Diversity”


Census Open Atlas Project Version Two

This time last year I published the first version of the 2011 Census Open Atlas, which comprised Output Area level census maps for each local authority district. This turned out to be quite a popular project, and I have also extended this to Japan.

The methods used to construct the atlases have now been refined, so each atlas is built from a series of PDF pairs comprising a map and a legend. These are generated for each of the census variables (where appropriate), with the layout handled by LaTeX. As demonstrated in the Japan atlas, this also gives the advantage of enabling a table of contents and a better description for each map.

Some other changes in version two include:

  • Labels added to the legends
  • Scale bars added
  • Addition of the Welsh only census variables
  • Removal of overly dense labels

When the original project was picked up by the Guardian, I made an estimate of the actual number of maps created; however, for this run, I counted them. In total, 134,567 maps were created.

Download the maps

The maps can be downloaded from GitHub; and again, the code used to create the maps is here (feel free to fix my code!).

Automated Savings

A manual map might typically take 5 minutes to create – thus:

  • 5 minutes × 134,567 maps = 672,835 minutes
  • 672,835 minutes / 60 = 11,213.9 hours
  • 11,213.9 hours / 24 = 467.2 days (no breaks!)

So, if you take a 35 hour working week for 46 weeks of a year (6 weeks holiday), this equates to 1,610 hours of map making time per year. As such, finishing 134,567 maps would take 6.9 years (11,213.9 / 1,610).

This would obviously be a very boring job; however, it would also be expensive. If we take the median wage of a GIS Technician at £20,030, then the “cost” of all these maps would be 6.9 × £20,030 = £138,207. This toy example illustrates how learning to code can save significant money, and indeed what a useful tool R is for spatial analysis.
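
Fittingly, the whole calculation also fits in a few lines of R, for anyone who wants to check the arithmetic:

#Back-of-the-envelope costing of manual map production
maps         <- 134567
mins_per_map <- 5

total_minutes <- maps * mins_per_map   #672,835 minutes
total_hours   <- total_minutes / 60    #11,213.9 hours
total_days    <- total_hours / 24      #467.2 days (no breaks!)

#35 hour week for 46 weeks of the year
hours_per_year <- 35 * 46                       #1,610 hours of map making time
years          <- total_hours / hours_per_year  #~6.9 years

#Median GIS Technician wage
salary <- 20030
cost   <- 6.9 * salary                 #£138,207, using the rounded 6.9 years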


Why Geographers Should Learn to Code

This article is published in the January 2014 issue of Geographical Magazine – page 77.

In my opinion, a geography curriculum should require students to learn how to code, ensuring that they’re equipped for a changed job market that’s increasingly detached from geographic information systems (GIS) as they were originally conceived.

The ability to code relates to basic programming and database skills that enable students to manipulate large and small geographic data sets, and to analyse them in automated and transparent ways. Although it might seem odd for a geographer to want to learn programming languages, we only have to look at geography curriculums from the 1980s to realise that these skills used to be taught. For example, it wouldn’t have been unusual for an undergraduate geographer to learn how to program a basic statistical model (for example, regression) from base principles in Fortran (a programming language popular at the time) as part of a methods course. But during the 1990s, the popularisation of graphical user interfaces in software design enabled many statistical, spatial analysis and mapping operations to be wrapped up within visual and menu-driven interfaces, which were designed to lower the barriers to entry for users of these techniques. Gradually, much GIS teaching has transformed into learning how these software systems operate, albeit within a framework of geographic information science (GISc) concerned with the social and ethical considerations of building representations from geographic data. Some Masters degrees in GISc still require students to code, but few undergraduate courses do so.

The good news is that it’s never been more exciting to be a geographer. Huge volumes of spatial data about how the world looks and functions are being collected and disseminated. However, translating such data safely into useful information is a complex task. During the past ten years, there has been an explosion in new platforms through which geographic data can be processed and visualised. For example, the advent of services such as Google Maps has made it easier for people to create geographical representations online. However, both the analysis of large volumes of data and the use of these new methods of representation or analysis do require some level of basic programming ability. Furthermore, many of these developments have not been led by geographers, and there is a real danger that our skill set will be seen as superfluous to these activities in the future without some level of intervention. Indeed, it’s a sobering experience to look through the pages of job advertisements for GIS-type roles in the UK and internationally. Whereas these might once have required knowledge of a particular software package, they increasingly look like advertisements for computer scientists, with expected skills and experience that wouldn’t traditionally be part of an undergraduate geography curriculum.

Many of the problems that GIS set out to address can now be addressed with mainstream software or shared online services that are, as such, much easier to use. If I want to determine the most efficient route between two locations, a simple website query can give a response within seconds, accounting for live traffic-volume data. If I want to view the distribution of a census attribute over a given area, there are multiple free services that offer street-level mapping. Such tasks used to be far more complex, involving specialist software and technical skills. There are now far fewer job advertisements for GIS technicians than there were ten years ago. Much traditional GIS-type analysis is now sufficiently non-technical that it requires little specialist skill, or has been automated through software services, with a subscription replacing the employment of a technician. The market has moved on!

Geographers shouldn’t become computer scientists; however, we need to reassert our role in the development and critique of existing and new GIS. For example, we need to ask questions such as which type of geographic representation might be most appropriate for a given dataset. Today’s geographers may be able to talk in general terms about such a question, but they need to be able to provide a more effective answer that encapsulates the technologies that are used for display. Understanding what is and isn’t possible in technical terms is as important as understanding the underlying cartographic principles. Such insights will be more available to a geographer who has learnt how to code.

Within the area of GIS, technological change has accelerated at an alarming rate in the past decade and geography curriculums need to ensure that they embrace these developments. This does, however, come with challenges. Academics must ensure that they are up to date with market developments and also that there’s sufficient capacity within the system to make up-skilling possible. Prospective geography undergraduates should also consider how university curriculums have adapted to modern market conditions and whether they offer the opportunity to learn how to code.
