Setting Up

Author

Jon Reades

Published

September 12, 2025

Overview

In the first week we will focus on the supporting infrastructure for ‘doing data science’. That is, we’ll install and configure tools such as GitHub and Docker, which support replicable, shareable, and documentable data science. As a (free) bonus, these tools also protect you against catastrophic (or merely irritating) data loss caused by over-zealous editing of code or content. You should see this as laying the foundation not only for your remaining CASA modules (especially those in Term 2) but also for your post-MSc career.

Learning objectives
  1. A basic understanding of the data science ‘pipeline’.
  2. An understanding of how data scientists use a wide range of ‘tools’ to do data science.
  3. A completed installation/configuration of these tools.

If you missed the Induction Week ‘install fest’, please now complete as many of these activities as you can:

  1. Go through the computer health check.
  2. Install the base utilities.
  3. Install the programming environment.

The last of these is the stage where you’re most likely to encounter problems that need our assistance, so knowing in advance that you’ll need our help in Week 1 means that you can ask for it much sooner in the practical!
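If you already have a working Python installation, a quick way to see whether some of the tools are visible to your system is to look for their commands on your PATH. The following is a minimal sketch, not part of the official install instructions: the command names `git` and `docker` are assumptions about how the tools are exposed on your machine, and the function name `find_tools` is purely illustrative.

```python
import shutil


def find_tools(names=("git", "docker")):
    """Map each command name to its full path, or None if it is not on the PATH.

    The default names are assumptions; substitute whatever commands
    your installation is expected to provide.
    """
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    # Print one line per tool so that missing ones are easy to spot.
    for name, location in find_tools().items():
        print(f"{name}: {location or 'not found on PATH'}")
```

A tool reported as not found may still be installed but not yet on your PATH, so treat the result as a prompt to ask for help rather than as a definitive diagnosis.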

Readings

Please make time to read:

Citation Article
Arribas-Bel and Reades (2018) URL

Study Guide

The following questions will help guide your reading and prepare you for class discussions:

  1. Drawing on Arribas-Bel and Reades (2018), compare and contrast GIS, Geocomputation, and Geographical Data Science (GDS):
  • What are their core focuses and methodological approaches?
  • How do they differ in their relationship to technological change?
  • What are the unique contributions of GDS in the context of “big data” and the rise of data science?
  2. Still drawing on Arribas-Bel and Reades (2018), consider the role of technological determinism in the evolution of geographical thought:
  • Do technological advancements determine the direction of geographical inquiry?
  • How do the authors characterize the relationship between technological change and the development of geographical thought?
  • What evidence do they provide to support their view?

In-Person Lectures

In this week’s workshop we will review the module aims, learning outcomes, and expectations with a general introduction to the course.

Session Video Presentation
Getting Started In Class Slides
Computers in Urban Studies In Class Slides
Principles In Class Slides
Tools of the Trade In Class Slides

Practical

This week’s practical is focussed on getting you set up with the tools and accounts that you’ll need across many of the CASA modules in Terms 1 and 2, and on familiarising you with ‘how people do data science’. Outside of academia, it’s rare to find a data scientist who works entirely on their own: most code is collaborative, as is most analysis! But collaborating effectively requires tools that: get out of the way of doing ‘stuff’; support teams in negotiating conflicts in code; make it easy to share results; and make it easy to ensure that everyone is ‘on the same page’.

Connections

The practical focusses on:

  • Getting you up and running with the coding and collaboration tools.
  • Providing you with hands-on experience of using these tools.
  • Configuring your programming environment for the rest of the programme.

Note

To save a copy of the notebook to your own GitHub repository: follow the GitHub link, click on Raw, and then use Save File As... to save it to your own computer. Make sure to change the extension from .ipynb.txt (which will probably be the default) to .ipynb before adding the file to your GitHub repository.
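If you would rather fix the extension with a few lines of Python than by hand, something like the sketch below would do it. The function name `fix_notebook_extension` is illustrative only, and it assumes the browser saved the file with exactly the `.ipynb.txt` ending described above.

```python
from pathlib import Path


def fix_notebook_extension(path):
    """Rename 'name.ipynb.txt' to 'name.ipynb'; return any other path unchanged."""
    path = Path(path)
    if not path.name.endswith(".ipynb.txt"):
        return path
    # with_suffix("") drops only the final suffix, i.e. the trailing '.txt'.
    target = path.with_suffix("")
    if target.exists():
        # Refuse to overwrite a notebook that is already there.
        raise FileExistsError(f"{target} already exists")
    path.rename(target)
    return target
```

For example, calling `fix_notebook_extension("Practical-01.ipynb.txt")` (a hypothetical file name) in the folder where you saved the download would leave you with `Practical-01.ipynb`, ready to add to your repository.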

To access the practical:

  1. Preview
  2. Download

References

Arribas-Bel, D., and J. Reades. 2018. “Geography and Computers: Past, Present, and Future.” Geography Compass 12 (e12403). https://doi.org/10.1111/gec3.12403.