Background: Why We Do What We Do

Computational approaches to understanding and managing cities – which is to say, approaches using computers and code – differ in important ways from the quantitative skills taught in traditional geography ‘methods’ classes: computational geography is underpinned by algorithms that employ concepts such as iteration and recursion, and we use these to tackle everything from a data processing problem to an entire research question.

For example, Alex Singleton’s OpenAtlas contains 134,567 maps. Alex designed and wrote a script to iterate over the Census areas (i.e. to ‘visit’ each area in turn when creating a map), and to recurse into smaller sub-regions from larger regions (i.e. to keep drilling down into smaller and smaller geographies) in order to generate maps at, literally, every conceivable scale. Then he let the computer do the ‘boring bit’ of actually creating each and every map.

Thinking Algorithmically

You should learn how to program a computer because it teaches you how to think.

There are many good reasons for you to learn to code, but let’s start with some good general reasons why you should learn to program a computer even if you never use it to make a map or complete a bit of spatial analysis.

Thinking algorithmically requires students and professionals to deal with abstraction: we don’t want to define how each analysis should work – or how each map should look – rather, we want to specify a set of rules about how to select and display data on a map, and then let the computer make them all for us. In this way of working it’s not really any more work to create 500 or 5,000 maps than it is to create 5 because we’ve already told the computer how to make useful maps.

Here’s another way to think about it:

An algorithm is like a recipe. It takes “inputs” (the ingredients), performs a set of simple and (hopefully) well-defined steps, and then terminates after producing an “output” (the meal).

This article also goes on to make some interesting points about AI and deep learning that are well worth a read, but for our purposes, the bit about it being a recipe is the important part: how would you break your problem down into steps like the ones you’d see in a recipe? Do A. Do B. If B doesn’t work, try C. If that doesn’t work, stop. If it does, then jump to Step F…

Learning to think this way is hard work: the first time I try a new recipe I really don’t know how things are going to taste. Similarly, the first time I use an algorithm to make a map or solve a problem I usually don’t actually know exactly how my maps are going to look until after I’ve made them. The difference from the ‘normal’, non-computational way of working is that I make a few changes to my code and then just run it again. And again… as many times as I need to in order to get what I want. I can keep changing the recipe until I get it just right.

Thinking Like a Programmer

However, trying the same recipe again and again and again also sounds like hard work! Wouldn’t it be faster to just click and let someone else do the work? Like when we don’t have the energy or skills to cook and end up orderubg something from Just Eat or, to impress our friends, HelloFresh? Similarly, isn’t it easier to just point-and-click in SPSS or ArcMap? Well, yes and no. There are two core advantages to working with code over pointing-and-clicking: your solution is transferrable, and the approach that you are developing to problem-solving also transfers very nicely to the ‘real world’ of employment.

Why do we say this?

  1. Programming solutions are transferrable because you aren’t just solving one problem, you are solving classes of problems. In the same way that many recipes build on the same basic ingredients (sometimes adding something new for increased ‘spice’), many applications use the same basic ingredients: it’s how they’re put together in new ways that lead to new outputs. It’s also a lot like Lego.
  2. Thinking like a programmer also translates well because you are learning to deal with abstraction. Yes, the details of a problem matter (just as ignoring cultural differences between two countries can matter), but it’s important to be able to break a really big, messy, complex problem down into smaller, tidier, more comprehensible bits that you can tackle. Programmers deal with this every day, so they tend to develop important skills in understanding and dealing with practical challenges of the sort that you’ll face every day in your career.

Here’s another useful bit of insight:

The best way [of solving problems] involves a) having a framework and b) practising it.

Problem-solving skills are almost unanimously the most important qualification that employers look for… more than programming languages proficiency, debugging, and system design…

— Hacker Rank (2018 Developer Skills Report)

You really should read the article (it’s not very long) but here are the key points:

  1. Understand the problem – most problems are hard because you don’t understand them and you will only know that you’ve understood it when you can explain it in plain-English.
  2. Plan – if you just dive in without thinking about what you need your code to take in and spit out then you’re going to waste a lot of time.
  3. Divide – break a hard problem down into simple, small steps and tackle them in small, simple blocks of code. Never try to just sit down and ‘code’.
  4. Debug with a fresh eye – if you really feel stuck, step away from the computer for 5 minutes, take a deep breath, and try to look at the problem with a fresh set of eyes rather than just diving back in. Most problems boil down to either not seeing the big picture, or not realising that the computer is doing exactly what you told it to do, and not what you meant for it to do.
  5. Practice – find ways to practice problem-solving and coding (not necessarily at the same time).

If you don’t take our word for it, how about taking Richard Feynman’s word on it?

If you can’t explain something in simple terms, you don’t understand it.

Back in the ‘olden days’ of blackboards

The Benefits of Coding?

In a practical context we think that the benefits of learning to code fall into three categories:

  1. Flexibility: a computer can often apply the same analytical process to a completely different data set (e.g. rainfall in UK vs rainfall in the US) with minimal effort compared to trying to do each step manually in, say, Excel or SPSS. For students, it comes down to this: if you discover a newer, better data set halfway through your dissertation and want to use this for your analysis instead of the old, inaccurate data, it’s a lot easier and faster to update your analyses if you have used code to do the analysis to-date!
  2. Reproducibility: recently, it’s been discovered that a lot of research cannot be reproduced. In other words, if one scientist tries to duplicate what someone else did in order to check something out (as is important in the scientific method) they’re finding that the results don’t line up. So a second example of why coding your data analysis for a dissertation: you’ve just finished your analysis when someone points out that you made a mistake with the data right back at the beginning; redoing all of that in Excel or SPSS would be a nightmare, but with code, it can be as easy as changing one line and hitting ‘Run’!
  3. Scalability: a computer doesn’t care if you throw 10 lines or 10 billion lines at it, the only thing that changes is how long it takes to get an answer. In other words, if your code ‘works’ on a subset of your data it should also work on your entire data set no matter how big it is. This is also a good way to develop code: rather than try to read in the whole data set in one go while you’re still trying to understand it, take a few rows and make sure you’re handling those ones correctly (and if what you see squares with what you were told) before expanding to larger and larger subsets.

Often, the payoff for coding the answer to a problem instead of just clicking through the options in SPSS or Arc can seem a long way away. It’s like learning a new language: you spend a lot of time asking directions to the train station or whether someone had a nice breakfast before you can start work on the novel or the business case. But the payoff is there if you stick with it!

Being a ‘good’ programmer

The best way to be a ‘good’ programmer is to know when the computer can help you and when it will just get in the way. A computer cannot ‘solve’ a problem for you, but it can help you to find the answer when you’ve told it what to look for and what rules to use in that search. A computer can only do exactly what you tell it to do, so if you don’t know what to do then the computer won’t either.

One of the founders of computing, Charles Babbage had this to say:

On two occasions I have been asked, — “Pray, Mr Babbage, if you put into the machine wrong figures, will the right answers come out?” In one case a member of the Upper, and in the other a member of the Lower, House put this question. I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. Passages from the Life of a Philosopher (1864), ch. 5 “Difference Engine No. 1”

Modern programmers call this: garbage in, garbage out. GIGO, for short.

The 3 virtues of a programmer

Another useful idea comes from Larry Wall (the man with the strong ’tache game below!), who created a programming language called Perl. Larry said that programmers had three virtues: Laziness, Hubris, and Impatience.

Some of the reasons that these are virtues in programming (but not in your studies!) are as follows:

  1. Laziness makes you want to put in the effort now to reduce the amount of effort you’ll have to put in later. It might be a lot of work to produce a map of one US State automatically using code, but once you’ve worked out how to do it for one state, then you’ve also figured out how to do it for all 50! That is useful laziness.
  2. Hubris makes you want to write code that other people won’t want to “say bad things about”. Over time you’ll come to understand more intuitively what makes ‘good’ code in more detail (and we cover it a bit in the last lesson), but the short version is: it’s efficient, it’s easy to read, and it’s clever.
  3. Impatience is about wanting the answer now and looking for ways to get there as quickly as possible. Being impatient doesn’t mean just jumping into writing code, it means that you first look too see whether and how other people have solved similar problems before starting work on your own code. Rather than reinventing the wheel, we try to stand on the shoulder of giants.

Hint: you’ll see a lot of laziness when you start trying to write code. Programmers don’t like writing remove when they could just write rm, nor do they like writing define when they could just write def. Keep an eye out for these mnemonics as they can be pretty daunting at first.

The 3 false virtues

Larry also pointed out that these virtues had three mirror-image false virtues:

  1. False laziness happens when you leave something working but half-finished and, most likely, about to break. When you start using StackOverflow you may find that it makes it easy to copy+paste answers into your script and then you can glue it together messily. This isn’t the same as understanding and adapting the solution that you found online to your problem, so it’s false laziness. To really develop a learning mindset, don’t copy+paste code, type it out.
  2. False hubris is thinking that no one else’s code is ‘good enough’ for you. Sometimes copy+paste is false laziness, but refusing to recognise when copy+paste (or importing a library, more on this later) is the right thing to do is false hubris.
  3. False impatience is getting started on coding your answer to a problem when you don’t yet understand what the problem actually is. One thing that a lot of programmers do is half-listen to what someone has asked them to do and then go haring off without sitting down to make any kind of plan. It’s like writing an essay without having done the readings. Nudge, nudge.

There’s a lot more thinking on this here.

Learning a (New) Language

The single most important thing that you can learn is how to frame your problem in a way that you can communicate to a computer. Crucially, the real power of the computer isn’t figuring out how to tell it to add 1, 2, 3, 4 together and calculate the mean (average), it’s figuring out how to tell it to add any possible set of numbers together and work out the mean. It’s when you start to see ways to do this for sets of hard problems (for you) that you know you’ve started to understand the language.

So these lessons are intended to help you start learning a new language – a programming language – but you should always remember this: you’re not stupid if you don’t know how to explain things to the computer so that it can help you find the answer. You’re learning the basics of how to communicate with computers; there are only two things that are silly: the first is expecting to be able to run before you can walk; the second is copying and pasting answers without trying to understand why they are answers.

Actually, there is a third silly thing: not asking for help. In the same way that learning a new human language is easier if you immerse yourself in the culture and make friends with some native speakers, learning a new computer lanugage is easier if you immerse yourself in the culture of computing and make friends with some more proficient speakers. But just as your French- or Chinese-speaking friends will get tried of answering your questions if you don’t make it obvious that you’re working hard to learn, most programmers won’t be very impressed if you just ask for ‘the answer’ and then come back two days later with the same question.

Mathematics is a Language, so is Code

There are obviously many ways that you can calculate the mean (also known as the average if your maths is a little rusty): in your head, using pencil and paper, on a calculator, in Excel… and, of course, using code! For a small set of simple numbers, using your brain is going to be a lot faster than typing it into a calculator or computer.

We can use Python like a calculator. For example, to calculate 2 plus 2 using Python we can just type 2 + 2 ↩︎ and get the answer. OK, but where do we run this?

Tip

Now is when you either start Python on your own computer (usually from the Terminal/Command Prompt) or mosey on over to Google Colab (to run Python in the cloud). If you don’t know how to do either, please visit the install and no install pages.

All of the rules of normal maths are respected by Python, though sometimes the syntax is a little different (as we’ll see later). So the following code will also work as expected to produce 418.0:

Tip

Notice the ‘clipboard’ icon at the right-hand side of the code block above this tip? That will copy the code to your computer’s memory, allowing you to paste it into another file/input, such as the Python interpreter or Google notebook so that you can save or run it.

Quick, what’s the average (mean) of: 1 + 2 + 3 + 4?

A little harder, huh? So type your ‘formula’ to calculate the mean of 1, 2, 3, 4 then either hit Return (if you’re working in the Terminal) or click the ‘play’ button on the tool bar at the top of the window (if you’re working in Google Colab) to run your first piece of Python code!

Your equation should include the four numbers above with some + symbols and ( and ), a / symbol, and another number.

If you are totally at a loss for what to type in the code cell, just click the “Answer” tab.

(1 + 2 + 3 + 4) / 4 
2.5

If you got the result 2, you are using Python version 2 and will need to check (and change) your installation of Python (go back to the Python Setup page.

Learning Languages is Hard

This is really important: just because you told the computer to do something that you eventually realise was ‘wrong’ (you may even feel silly) does not mean that you are stupid.

Did you ever try to learn a foreign language? Did you expect to be fluent after a couple of classes? Did you accidentally tell your teacher that you were ‘pregnant’ when you were just ‘embarrassed’? Assuming that you had a realistic expectation of how far you’d get with French, Chinese, or English in your first couple of years, then you probably figured it’d be a while before you could hold a conversation with someone else.

So why would you expect to sit down at a computer and be able to hold a conversation with it (which is another way of thinking about what coding is) after reading a few pages of text and watching a YouTube video or two? You will need to give it time, you will need to get used to looking at the documentation, you will need to ask for help (this seems like a good time to introduce Stack Overflow), and you will need to persevere.

If you don’t have a reason for learning to code outside of trying making lots of money that’s not a very long term passion… but when you have an idea or a problem that you’re passionate about solving then that’s why we keep on going… but do you need to have an understanding of complex math or logic skills, the answer is no.

So while ‘making money’ is (often) a nice outcome of learning to code, having a passion for what you want to do with your code is what’s going to get you through the worst parts of the learning curve. You also need to be realistic: to become a professional programmer is something that happens over many years, you probably won’t just take a couple of classes and then go out into the world saying “I’m a programmer.”

And, no, you do not need to know advanced maths in order to learn how to code: you need to be able to think logically and to reframe your problems in ways that align with the computer.

On Accessible Code

In the early days of computing, programs weren’t even written in English (or any other human language), they were written in Assembly/Machine Code. One of the people who thought that was crazy was this rather impressive Rear Admiral:

Grace Hopper felt that applications should be written in a way that more people could understand; this would be good for the military, but it would also be good for business and society in general. For her efforts, she is now known as the Mother of COBOL (the COmmon Business Oriented Language), a language that is still in (some) use today.

The Open Source Ethos

Once I’ve got a solution to my current problem, I can take that code and apply it to a new problem. Or a new case study. Or, I can post it online and let others build off of my work to tackle problems that I’ve not even considered! Giving away my code might seem like a bad idea, but think about this: in a world of exciting research questions, are you going to be able to tackle every single one? Your own work already builds off of code that other people gave away (the Mac OS, Linux, QGIS, Python, etc.)… perhaps you should give something back to the community? Not just because it’s a nice thing to do, but because people will find out about you through your code. And those people might be in a position to offer you a job, or they might approach you as a collaborator, or they might point someone else with an interesting opportunity in your direction because you have built a reputation as a ‘contributor’. It has happened to us.

Practice makes perfect

Your language class (assuming that you took one) probably had a ‘lab’ where you practised your new language and you probably made a lot of mistakes when you were getting started. It’s the same for programming: the reason you got a ‘silly’ answer is that we haven’t taught you how to ask the right question yet! For a language like Python “4” is not always the same as “4.0”… and sometimes the way ahead is unclear, but don’t worry if you can’t get the right answer yet, how to ‘talk numbers’ is the main topic in the next lesson and we’ll show you the answer at the end of this lesson.

So, we want you to remember that there are no stupid questions when it comes to programming. We have all been lost at one point or another! Even skilled programmers (which doesn’t, TBH, describe many academics) frequently still ask for help by asking Google, it’s just that we do it for harder questions. And that’s only because we have had a lot more practice in the language of programming. So the only silly thing you can do in this short course is to assume that you can speed through the questions and don’t have to practice.

Actually, there’s one other silly thing that you can do, and that’s not asking for help at all. If you’ve been banging your head against the computer for twenty minutes trying to get something to work and it just isn’t working then try explaining it to a friend or even a pet or stuffed animal! Often, the process of talking something out helps you to find the answer for yourself.

If you don’t expend any effort in trying to understand how the code works, or if you just copy the answer off of your friend, that’s the same as assuming you’ll learn a foreign language just because you’re sitting next to a friend who is taking the same language course! That is also silly.

Warning

You will need to practice in order to progress. You don’t learn French or Chinese by practising in the language lab once a week. You won’t learn to program a computer by only practising once a week.

When computers beat brains (or calculators)

What makes a computer potentially better than a calculator (or your brain) is that a computer isn’t daunted by having to count lots of numbers and it doesn’t need you to input each number individually! The computer can also do things like:

  • Find out the amount of rain that fell in London, Manchester, and Edinburgh yesterday from an online weather service;
  • Work out the average rainfall for these three cities; and then
  • Work out the standard deviation for rainfall.

And it can do all of this in a matter of milliseconds! It can also do the same for 3,000 cities just as easily; sure, it’ll take a little bit longer, but it’s the same basic code.

In other words, code is scalable in a way that brains and calculators are not and that is a crucial difference.

Here’s a trivial example of when computers start to get better and faster than brains:

Remember

Copy (clicking on the clipboard icon on the right) and paste the code into whatever Python environment you are using and then ‘run’ the code to get the answer.

Programming in Python

In these lessons we will be using the Python programming language. As with human languages, there are many programming languages in the world, each with their own advantages and disadvantages, and each with their own vocabulary (allowed words) and grammar (syntax). We use Python.

Python vs R

Alongside Python, the other language that is often mentioned by people doing data-led research is R. It’s the other one that many of your lecturers and a lot of other scientists use in a lot of their work. There’s a great deal of debate about the relative merits of Python and R, but for our purposes, both Python and R can help us to undertake the geographical analysis. That is, in fact, the premise of this entire course!

So why have we chosen to use Python here? Of the two languages, we think that Python has some specific advantages:

  1. It was designed for teaching, so its syntax is easier for a human to ‘parse’ than R’s
  2. It is more like other languages, so it is more readily transferrable if you need to learn another language. Think of it as learning Italian, which also makes it easier to learn Spanish and French.
  3. It is the one most-used as part of a geographical workflow – what we mean by this is that you can find Python buried inside of ESRI’s ArcGIS and the open-source QGIS applications, and it also sits behind (or talks to) a number of other tools that allow us to work flexibly and scalably with geo-data.
  4. It is easier to operationalise – Python offers more services/tools to enable you to turn something from a ‘hack’ (which doesn’t mean what you think it means into a ‘service’.

However, if you have been told R is the way to go then don’t worry, the concepts covered here still translate. And many of the contributors to these lessons use both languages… it just depends on the problem.

Python what?

Python was invented in the late 1980s by Guido van Rossum, Python’s ‘benevolent dictator’ (until he stepped down from the position on 12 July 2018), which means that he (and some other very smart people) try to ensure that the language continues to meet the basic goals of:

  • Being very easy to read (syntax)
  • Using plain-English for many functions and operators (allowed words)
  • Has a comprehensive style guide: PEP8 (syntax)
  • Has no unnecessary special formatting characters (syntax and allowed words)

So while Python is not a language that enables the computer to make calculations the fastest (C and C++ are faster), nor is it the safest (you wouldn’t use it to fly a rocket to Mars), it is a very readable, learnable and maintainable language.

So if you want to learn to code, to do ‘data science’, or build a business, Python is a great choice.

The points above are also made in Python In A Nutshell by Martin Brochhaus which you may find interesting and useful to accompany your learning of Python.

Three takes on Python

The images below are links to three videos pitched in quite different ways at the advantages of Python, all of which touch on issues we’ll be dealing with later… so watch the videos (even if they’re a bit silly in places)!

Further Reading

  • A must read: The Hard Way is Easier
  • Two easy and accessible videos to start wrapping your head around programming (although they are not Python-centric) 1 and 2

Credits!

Contributors:

The following individuals have contributed to these teaching materials:

License

The content and structure of this teaching project itself is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 license, and the contributing source code is licensed under The MIT License.

Acknowledgements:

Supported by the Royal Geographical Society (with the Institute of British Geographers) with a Ray Y Gildea Jr Award.

Potential Dependencies:

This lesson may depend on the following libraries: None