Visualising Data

Overview

This week is focussed on formalising our understanding of two things that we’ve already touched on at multiple points throughout the term: 1) how to link data sets; 2) how to tune matplotlib plots to ‘publication quality’ by tweaking fine-grained elements. So this week begins by looking at how to join data together in both spatial and non-spatial contexts, before facing off with the monster that most Python visualisation libraries ultimately depend on: matplotlib. As a tool it is both incredibly powerful and intensely counter-intuitive if you are used to R’s ggplot. We will also be looking more widely to the future of quantitative methods, the potential of a geographic data science, and the ways in which we can move between spatial and non-spatial paradigms of analysis within the same piece of work.

Learning Outcomes
  1. You have formalised your understanding of how to link data in Python.
  2. You have broadened your thinking about the purpose of data visualisation.
  3. You are working intensively on the group project.

Lectures

You are strongly advised to watch these videos on linking data and spatial data, as well how to group data within pandas; however, you will not be asked to present any of these because our attention is now on the final assessments. As well, you should by now be familiar with the concept of how to join data from the GIS module (CASA0005), so this focusses on two things: 1) how to do this in Python; and 2) how to approach this process with large or mismatched data sets.

Session Video Presentation
Linking Data Video Slides
Linking Spatial Data Video Slides
Data Visualisation Video Slides

Other Prep

  • Come to class prepared to briefly present/discuss:
  • Complete the short Moodle quiz associated with this week’s activities.

You may also want to look at the following reports / profiles with a view to thinking about employability and how the skills acquired in this module can be applied beyond the end of your MSc:

  • Geospatial Skills Report <URL>
  • AAG Profile of Nicolas Saravia <URL>
  • Wolf et al. (2021) <URL>
Connections

While I expect most of you will be focussed on assessments, you should seriously consider returning to these readings over the term break as they will help you to reflect on what you’ve learned this term – in this module and across the programme as a whole – in terms of the direction of the field, the opportunities in industry, and the kinds of work that people with (sptial) data science skills can do.

Practical

The practical will lead you through the application of grouping and aggregation functions in pandas, and the fine-tuning of data visualisations in Matplotlib/Seaborn. In many ways, this should be seen (bar the aggregation) as largely a recap of material encountered in previous sessions. However, you should see this as an important step in the production of outputs and analyses needed for the final project. That said, you would be better off spending time on the substance of the report first and only turning to the fine-tuning of the visualisations if time permits.

Connections

The practical focusses on:

  • Linking data as part of a visualisation process.
  • Automating the production of map outputs in Python to create an ‘atlas’.
  • Thinking about proxies; e.g. why would you find ppsqm more/less useful than price? what are you really measuring?

To access the practical:

  1. Preview on GitHub
  2. Download the Notebook

References

Shapiro, W., and M. Yavuz. 2017. Rethinking ’distance’ in New York City.” Medium. https://medium.com/topos-ai/rethinking-distance-in-new-york-city-d17212d24919.
Singleton, Alex, and Daniel Arribas-Bel. 2021. “Geographic Data Science.” Geographical Analysis 53 (1):61–75. https://doi.org/10.1111/gean.12194.
Wolf, Levi John, Sean Fox, Rich Harris, Ron Johnston, Kelvyn Jones, David Manley, Emmanouil Tranos, and Wenfei Winnie Wang. 2021. “Quantitative Geography III: Future Challenges and Challenging Futures.” Progress in Human Geography 45 (3). SAGE Publications Sage UK: London, England:596–608.