Data and the City workshop (day 1)

The workshop, which is part of the Programmable City project (funded by the European Research Council), is being held in Maynooth today and tomorrow. The papers and discussions touched on multiple current aspects of technology and the city: Big Data, Open Data, crowdsourcing, and critical studies of data and software. The notes below focus on aspects that are relevant to Volunteered Geographic Information (VGI), Citizen Science and participatory sensing – aspects of Big Data/Open Data are noted more briefly.

Rob Kitchin opened with a talk to frame the workshop, highlighting the history of city data (see his paper on which the talk is based). We are witnessing a transformation from data-informed cities to data-driven cities. The data streams involved include Big Data, official data, sensors, drones and other sources. The sources also include volunteered information such as social media, mapping, and citizen science. Cities are becoming instrumented and networked, and the data is assembled through urban informatics (focusing on interaction and visualisation) and urban science (which focuses on modelling and analysis). There is a lot of critique – in relation to data, there are questions about the politics of urban data, the corporatisation of governance, the use of buggy, brittle and hackable urban systems, and social and ethical aspects. On the politics, this means accepting that data is not value-free or objective, and that it is influenced by organisations with specific interests and goals. Another issue is the corporatisation of data, with questions about data ownership and data control. There are further issues of data security and data integrity when systems are buggy and brittle – there have already been cases of hacking into city systems. Social, political and ethical aspects include data protection and privacy, dataveillance/surveillance, social sorting through algorithms, control creep, dynamic pricing and anticipatory governance (expecting someone to be a criminal). There are also technical questions: coverage, integration between systems, data quality and governance (and the communication of information about quality), and the skills and organisational capabilities needed to deal with the data.
The aim of the workshop is to think critically about the data, and to ask questions about how this data is constructed and run.

The talk by Jim Thatcher & Craig Dalton explored provenance models of data. A core question is how to demonstrate that data is what it says it is and where it came from. In particular, they consider how provenance applies to urban data. There is an epistemological leap from an individual (person) to data point(s) – there can be up to 1500 data attributes per person in a corporate database. City governance requires more provenance in information than commercial imperatives do. They suggest that data users and producers need to be aware of the data and how it is used.

Evelyn Ruppert asked: where are the data citizens? She discussed the politics in data, and thinking about people as subjects in data – seeing people as actors who are intentional and political in their acts of creating data. Being digital mediates between people and technology and what they do. There are myriad forms of subjectivation – there are issues of rights and how people exercise these rights. Being a digital citizen is not just about being a recipient of rights, but also about the ability to take and assert rights. She used the concept of cyberspace as it is useful for understanding the rights of the people who use it, while being careful about what it means. There is a conflation of cyberspace and the Internet, and failures to see it as a completely separate space. She sees cyberspace as the set of relations and engagements that are happening over the Internet. She referred to her recent book ‘Being Digital Citizens’. Cyberspace has relationships to real space – in relation to Lefebvre’s concepts of space. She uses speech-act theory, which explores the ability to act through saying things, and there is a theoretical possibility of performativity in speech. We are not in command of what will happen with speech and what the act will be. We can assert acts through the things we do, and not only through the things we say, and that’s what is happening with how people use the Internet and construct cyberspace.

Jo Bates talked about data cultures and power in the city, starting from the hierarchy in data and information. Data can be thought of as ‘alleged evidence’ (Buckland) – data can be thought of as material, as specific things – data have dimensionality, weight and texture, and exist as something. Cox, in 1981, viewed the relationship between ideas, institutions and material capabilities – and the tensions between them – with institutions seen as a stabilising force compared to ideas and material capabilities, although the institutions may be outdated. She noted that sites of data cultures are historically constituted but also dynamic and porous – but we need to look at who participates and how data moves.

The session was followed by a discussion. Some of the issues: I raised the point of the impact of methodological individualism on Evelyn’s and Jim’s analyses – for Evelyn, digital citizenship is for collectives, and for Jim, the provenance and use of devices is done as part of collectives and data cultures. Jo explored the idea of a “progressive data culture” and suggested that we don’t yet understand what the conditions for it are – the inclusive, participatory culture is not there. For Evelyn, data is only possible through the action of the people who are involved in its making, and the private ownership of this data does not necessarily make sense in the long run. Regarding the hybrid space view of cyberspace/urban spaces – they are overlapping and it is not helpful to try and separate them. Progressive data cultures require organisational change in government and other organisations. Tracey asked about work on indigenous data, and the way it is owned by the collective – and noted that there are examples in the Arctic with a whole setup for changing practices towards traditional and local knowledge. The provenance goes all the way to the community; in the Arctic Spatial Data Infrastructure there are lots of issues with integrating indigenous knowledge into the general data culture of the system. The discussion ended with an exploration of the special case of urban/rural – noting the code/space nature of agricultural spaces, such as the remote control of John Deere tractors, the use of precision agriculture, control over space (so people can’t get into it), tagged livestock, as well as variable access to the Internet, speed of broadband etc.

The second session looked at data infrastructure and platforms, starting with Till Straube, who looked at Situating Data Infrastructure. He highlighted that Git (GitHub) blurs the lines between code and data, which is also the case in functional programming – code is data and data is code. He also looked at software or conceptual technology stacks, with hardware at the bottom. He therefore uses the concept of topology from Science and Technology Studies and Actor-Network Theory to understand the interactions.

Tracey Lauriault talked about ontologizing the city – her research looked at the transition of Ordnance Survey Ireland (OSi) with their core GIS – the move towards an object-oriented and rules-based database. How is the city translated into data, and how does the code influence the city? She looked at OSi and the way it produces the data for the island, providing infrastructure for other bodies. OSi started as a colonial project, and moved from cartographic maps and a digital data model to a full object-oriented structure. The change is about understanding and conceptualising the mapping process. The ontology is about which things are important for OSi to record and encode – and the way in which the new model allows space to be reconceptualised – she had access to a lot of information about the engineering, tendering and implementation process, and also followed some specific places in Dublin. She explored her analysis methods and the problems of trying to understand how the process works even when you have access to information.

The discussion that followed explored the concept of the ‘stack’, but also ideas of considering the stack at planetary scale. The stack is pervading other ways of thinking – the stack is more than a metaphor: it’s a way of thinking about IT development, but it can be flattened. It gets people to think about how the different parts are inter-related. Tracey: it is difficult to separate the different parts of the system because there is so much interconnection. Evelyn suggested that we can think about the way maps were assembled and for what purpose, and understand how the new system is aiming to give certain outcomes. To which Tracey responded that the system moved from a map to a database, and that Ian Hacking’s approach to classification systems needs to be tweaked to make it relevant and effective for understanding systems like the one that she’s exploring. The discussion expanded to questions about how large systems are developed and what methodologies can be used to create systems that can deal with urban data, including discussion of software engineering approaches, organisational and people change over time, ‘war stories’ of building and implementing different systems, etc.

The third and last session was about data analytics and the city – although the content wasn’t exactly that!

Gavin McArdle covered his and Rob Kitchin’s paper on the veracity of open and real-time urban data. He highlighted the value of open data – from claims of transparency and enlightened citizens to very large estimates of the business value. Yet, while data portals are opening in many cities, there are issues with the veracity of the data – metadata is not provided along with the data. He covered spatial data quality indicators from ISO, ICA and transport systems, but questioned whether the typical standards for data are relevant in the context of urban data, and whether we may need to reconsider how to record it. By looking at 2 case studies, he demonstrated that the data is problematic (e.g. indicating travel in the city of 6km in 30 sec). Communicating the changes in the data to other users is an issue, as is getting information from the data providers – it may be possible to have a metadata catalogue that adds information about a dataset and an explanation of how to report veracity. There are facilities in Paris and Washington DC, but they are not used extensively.
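
To make the 6km in 30 sec example concrete, here is a minimal sketch of the kind of plausibility check that would flag such a record. This is my own illustration and not the method from the paper: the function names and the 130 km/h ceiling are assumptions chosen only for the example.

```python
# Illustrative sanity check for real-time travel records.
# All names and the speed ceiling are assumptions, not taken from the paper.

MAX_PLAUSIBLE_KMH = 130.0  # assumed upper bound for travel within a city


def implied_speed_kmh(distance_km: float, elapsed_s: float) -> float:
    """Average speed (km/h) implied by covering distance_km in elapsed_s seconds."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return distance_km * 3600.0 / elapsed_s


def is_plausible(distance_km: float, elapsed_s: float,
                 max_kmh: float = MAX_PLAUSIBLE_KMH) -> bool:
    """True if the implied average speed does not exceed the assumed ceiling."""
    return implied_speed_kmh(distance_km, elapsed_s) <= max_kmh


if __name__ == "__main__":
    # The example from the talk: 6 km covered in 30 seconds.
    speed = implied_speed_kmh(6.0, 30.0)
    print(f"implied speed: {speed:.0f} km/h, plausible: {is_plausible(6.0, 30.0)}")
```

6 km in 30 seconds works out to 720 km/h, so any reasonable urban ceiling would reject the record. The harder question raised in the talk, how to report such findings back to the data provider and to other users, is not something a check like this addresses.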

Next, Chris Speed talked about blockchain city – spatial, social and cognitive ledgers, exploring the potential of distributed recording of information as a way to create all forms of markets in information that can be controlled by different actors.

I closed the session with a talk based on my paper for the workshop; the slides are available below.

The discussion that followed explored aspects of representation and noise (produced by people who are monitored, instruments or ‘dirty’ open data), and some clarification of the link between the citizen science part and the philosophy of technology part of my talk – highlighting that Borgmann’s use of ‘natural’, ‘cultural’ and ‘technological’ information should not be confused with the everyday use of these words.