Foundations (Pt. 2)
Overview
This week we will dig into data structures (lists and dictionaries) in greater detail so that you understand how we design structures to store and organise data in ways that simplify our analysis. We will also be looking at the Unix Shell/Terminal as a ‘power user feature’ that is often overlooked by novice data scientists.
- To see how ‘simple’ concepts can be (re)combined to tackle complex problems.
- An introduction to making use of Git+GitHub.
- An introduction to making use of the Shell/Terminal.
This week we also start to move beyond Code Camp, so although you should recognise many of the parts that we discuss, you’ll see that we begin to put them together in a new way. The next two weeks are a critical transition between content that you might have seen before in Code Camp (see Practical) or other introductory materials, and the ‘data science’ approach.
Preparatory Lectures
Come to class prepared to present/discuss:
Session | Video | Presentation |
---|---|---|
Dictionaries | Video | Slides |
LOLs | Video | Notes |
DOLs to Data | Video | Slides |
The Command Line | Video | Slides |
Getting Stuck into Git | Video | Slides |
Other Preparation
Readings
Come to class prepared to discuss the following readings:
Citation | Article | ChatGPT Summary |
---|---|---|
Donoho (2017) | URL | Summary |
Franklin (2024) | URL | Summary |
Travers, Sims, and Bosetti (2016) | URL | N/A |
Study Guide
The following questions will help guide your reading and prepare you for class discussions:
- Thinking about Donoho (2017)’s definition of data science…
  - What is the “Big Data” meme and why does Donoho find it misleading?
  - Explain Breiman’s concept of the “two cultures” in data analysis.
  - What does Tukey argue are the three essential constituents of a science, and how does he apply them to the field of data analysis?
  - What are CTFs and how do they relate to the “predictive culture” of data analysis?
  - What are the six divisions of the Greater Data Science (GDS) framework proposed by Donoho (2017)?
- According to Franklin (2024)…
  - What are three recent developments that distinguish the current landscape of open software in quantitative methods from previous eras?
  - What are some of the emerging subfields within quantitative methods, and what common goal do they share?
  - Franklin argues that the rebranding of quantitative methods as “data science” or “analytics” can be beneficial in certain contexts. Explain these benefits.
  - What concerns are raised about the fragmentation of the quantitative methods community identity?
- What is the main argument presented in “Housing and Inequality in London” by Travers, Sims, and Bosetti (2016)?
- How does the report define “affordable housing” and what are the key challenges facing London’s housing market?
- What are the main drivers of inequality in London, and how do they intersect with housing issues?
- What policy recommendations are proposed to address the housing crisis in London?
Franklin (2024) offers another perspective on ‘the discipline’ of quantitative human geography and its heterogeneity, while Donoho (2017) will give you context on how data science might differ from what’s covered in Quantitative Methods. You might also find Unwin (1980) useful for understanding why the practicals are set up the way they are and why we don’t post ‘answers’ until a few days after the last practical group has completed its session.
Practical
This week’s practical will take you through the use of dictionaries and introduce the concept of ‘nested’ data structures. We’ll also be looking at how functions (and variables) can be collected into reusable packages, which we can either write ourselves or draw from a worldwide bank of experts – I know who I’d rather depend on when the opportunity arises! However, if you have not yet completed Code Camp (or were not aware of it!), then you will benefit enormously from tackling the following sessions:
The practical focusses on:
- Comparing the use of Python lists and dictionaries to store tabular data.
- Extending lists and dictionaries into nested data structures.
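To give a flavour of the two points above, here is a minimal sketch (not the practical itself) that stores the same small table as a list-of-lists, as a dictionary-of-lists, and as a nested dictionary. The city values are made-up placeholders chosen for illustration, and the names `cities_lol`, `cities_dol`, `cities_nested`, `lookup_lol` and `lookup_dol` are hypothetical rather than taken from the course materials.

```python
# Illustrative placeholder data only: the numbers are NOT real statistics.

# List-of-lists (LoL): each inner list is one row; column 0 holds the name.
cities_lol = [
    ["London", 8_800_000, 51.51],
    ["Manchester", 550_000, 53.48],
    ["Bristol", 470_000, 51.45],
]

# Dictionary-of-lists (DoL): each key is a column name, each value a column.
cities_dol = {
    "name": ["London", "Manchester", "Bristol"],
    "population": [8_800_000, 550_000, 470_000],
    "latitude": [51.51, 53.48, 51.45],
}


def lookup_lol(rows, city, column_index):
    """Scan every row until the city is found; columns are known only by position."""
    for row in rows:
        if row[0] == city:
            return row[column_index]
    return None


def lookup_dol(table, city, column):
    """Find the city's position in the 'name' column, then read the named column."""
    if city not in table["name"]:
        return None
    position = table["name"].index(city)
    return table[column][position]


# Nested dictionary: the outer key is the city, the inner dictionary its attributes.
cities_nested = {
    row[0]: {"population": row[1], "latitude": row[2]} for row in cities_lol
}

print(lookup_lol(cities_lol, "Bristol", 1))
print(lookup_dol(cities_dol, "Bristol", "population"))
print(cities_nested["Bristol"]["population"])
```

All three lookups return the same value, but the list-of-lists version forces you to remember that population lives in column 1, whereas the dictionary versions let you ask for it by name.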
To access the practical: