About the course

Various notes and policies for the course.

Prior background

Data science uses code to analyze messy data. We aim to teach you that in the course. To take the course, you do not need any previous experience of using code or of computer programming. We expect that a large proportion of the students will be starting from scratch.

One advantage of data science methods is that we can use code to express algorithms, meaning the logical steps of the analysis. It turns out that expressing our analysis in this way makes it possible to avoid much of the mathematics in data analysis and statistics. We will use some simple mathematics in the course, but nothing beyond what you have learned in your secondary school / GCSE maths.
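To give a flavour of what this means, here is a minimal sketch in Python. It is our own illustration, not an exercise from the course, and the names `count_heads` and `estimate_proportion` are invented for this example. The question is: how often would a fair coin give 60 or more heads in 100 tosses? Instead of working through a formula, the code simply repeats the experiment many times and counts.

```python
import random


def count_heads(n_tosses, rng):
    """Simulate n_tosses tosses of a fair coin; return the number of heads."""
    return sum(rng.random() < 0.5 for _ in range(n_tosses))


def estimate_proportion(n_trials=10_000, n_tosses=100, threshold=60, seed=0):
    """Estimate how often we see `threshold` or more heads in `n_tosses` tosses.

    The values of n_trials, threshold and seed are arbitrary choices made
    for this illustration.
    """
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    n_extreme = 0
    for _ in range(n_trials):
        # One trial is one complete run of n_tosses tosses.
        if count_heads(n_tosses, rng) >= threshold:
            n_extreme += 1
    return n_extreme / n_trials


proportion = estimate_proportion()
print(f"Estimated chance of 60 or more heads in 100 tosses: {proportion:.3f}")
```

Each line of the code corresponds to a logical step you could describe in words: toss the coin, count the heads, repeat, and see how often the count is unusually high. Do not worry if the details are unfamiliar; the course starts from the beginning.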

Mixtures of experience

You do not need to have done any programming before to take this course, but some of you will have some programming experience. If that is you, you will find that you can go quickly through the early material in the course. Your work, in those early parts of the course, is to help the others to learn. In doing so, we think you will find that you get a deeper understanding of programming, and of the material.

If you do not have any previous experience of programming, you will find that you need more time to learn the ideas and let them sink in. Learning a programming language is like learning any language: it takes time, concentration and repetition. Keep trying things that are a little bit harder than you think you can manage. It will be frustrating at first, but you will learn the value of frustration.

In a word, be patient.

Mixtures of background

Data science is a subject that crosses all disciplines. This can be exciting, as we expand our interests and work out how to make sense of many kinds of data. It is also a challenge that needs patience. As a data scientist, you will need to listen carefully to understand unfamiliar ways of thinking, and problems you had not thought about before.

In a word, be patient.

Grading

Please do not worry about the grades.

Here is a short pause.

Please do not worry about the grades.

Your job, in this course, is to learn to think straight, about what can and cannot be learned from data. Along the way, we hope that you will find that you are exploring interesting data and finding answers to interesting questions. If you have fun, as we hope you do, you will do well on the course.

After every class, there will be an exercise for you to do. Please use these to find out how well you are doing, and to help us find out what we have taught you, and what we have failed to teach, so we can do better in the next class.

There are only two graded assessments. These are:

  1. A structured data analysis, due at the end of the first semester (30% of the course mark). More details will follow.
  2. A group data analysis project, due at the end of the second semester, on data chosen by the group (70% of the course mark). The group submits a project report of 5000 words. All contributions will have been publicly recorded using standard version control / issue tracking. Each student submits an explanation of their own role, with evidence from these public records. More details will follow.

Working together

You will find that data science involves a lot of collaboration. Programmers teach non-programmers, people who know the data problem teach those who do not know, sceptics teach optimists, and optimists teach sceptics.

Please feel free to work with each other on the questions after each class. You will learn more that way. Be careful not to coast; otherwise you will fall behind, and the learning will get harder and harder. You will need to work alone for the structured data analysis described above.