About this Course

78,617 recent views
Shareable Certificate
Earn a Certificate upon completion
100% online
Start instantly and learn on your own schedule.
Flexible deadlines
Reset deadlines in accordance with your schedule.
Beginner Level

You will need mathematical and statistical knowledge and skills at least at high-school level.

Approx. 29 hours to complete
English

What you will learn

  • Define and explain the key concepts of data clustering.

  • Demonstrate understanding of the key constructs and features of the Python language.

  • Implement in Python the principal steps of the K-means algorithm.

  • Design and execute a whole data clustering workflow and interpret the outputs.

Skills you will gain

  • K-Means Clustering

  • Machine Learning

  • Programming in Python

Offered by

University of London

Goldsmiths, University of London

Syllabus - What you will learn from this course

Content rating: 94% thumbs up (5,396 ratings)

Week 1: Foundations of Data Science: K-Means Clustering in Python

7 hours to complete
9 videos (Total 22 min)
9 videos
Introduction to Data Science (2 min)
What is Data? (1 min)
Types of Data (1 min)
Machine Learning (3 min)
Supervised vs Unsupervised Learning (2 min)
K-Means Clustering (4 min)
Preparing your Data (1 min)
A Real World Dataset (53 sec)
4 practice exercises
Types of Data – Review Information (15 min)
Supervised vs Unsupervised – Review Information (15 min)
K-Means Clustering – Review Information (30 min)
Week 1 Summative Assessment (40 min)

Week 2: Means and Deviations in Mathematics and Python

4 hours to complete
11 videos (Total 37 min), 4 readings, 11 quizzes
11 videos
2.1 – Introduction to Mathematical Concepts of Data Clustering (1 min)
2.2 – Mean of One Dimensional Lists (2 min)
2.3 – Variance and Standard Deviation (3 min)
2.4 Jupyter Notebooks (6 min)
2.5 Variables (4 min)
2.6 Lists (4 min)
2.7 Computing the Mean (3 min)
2.8 Better Lists: NumPy (3 min)
2.9 Computing the Standard Deviation (6 min)
Week 2 Conclusion (31 sec)
4 readings
Population vs Sample, Bias (10 min)
Variability, Standard Deviation and Bias (10 min)
Python Style Guide (10 min)
NumPy and Array Creation (20 min)
10 practice exercises
Population vs Sample – Review Information (5 min)
Mean of One Dimensional Lists – Review Information (3 min)
Variance and Standard Deviation – Review Information (4 min)
Jupyter Notebooks – Review Information (20 min)
Variables – Review Information (10 min)
Lists – Review Information (10 min)
Computing the Mean – Review Information (10 min)
Better Lists – Review Information (10 min)
Computing the Standard Deviation – Review Information (10 min)
Week 2 Summative Assessment (40 min)

Week 3: Moving from One to Two Dimensional Data

8 hours to complete
16 videos (Total 53 min), 10 readings, 15 quizzes
16 videos
3.1 Multidimensional Data Points and Features (2 min)
3.2 Multidimensional Mean (2 min)
3.3 Dispersion: Multidimensional Variables (3 min)
3.4 Distance Metrics (5 min)
3.5 Normalisation (1 min)
3.6 Outliers (1 min)
3.7 Basic Plotting (2 min)
3.7a Storing 2D Coordinates in a Single Data Structure (6 min)
3.8 Multidimensional Mean (4 min)
3.9 Adding Graphical Overlays (5 min)
3.10 Calculating the Distance to the Mean (3 min)
3.11 List Comprehension (3 min)
3.12 Normalisation in Python (5 min)
3.13 Outliers and Plotting Normalised Data (2 min)
Week 3 Conclusion (30 sec)
10 readings
Multidimensional Data Points and Features Recap (10 min)
Multidimensional Mean Recap (10 min)
Multidimensional Variables Recap (10 min)
Distance Metrics Recap (10 min)
Normalisation Recap (10 min)
Note on Matplotlib (10 min)
Matplotlib Scatter Plot Documentation (20 min)
Matplotlib Patches Documentation (10 min)
List Comprehension Documentation (20 min)
3.12 Errata (10 min)
15 practice exercises
Multidimensional Data Points and Features – Review Information (3 min)
Multidimensional Mean – Review Information (3 min)
Dispersion: Multidimensional Variables – Review Information (5 min)
Distance Metrics – Review Information (6 min)
Normalisation – Review Information (3 min)
Outliers – Review Information (30 min)
Basic Plotting – Review Information (5 min)
Storing 2D Coordinates – Review Information (30 min)
Multidimensional Mean – Review Information (30 min)
Adding Graphical Overlays – Review Information (30 min)
Calculating Distance – Review Information (30 min)
List Comprehension – Review Information (30 min)
Normalisation in Python – Review Information (30 min)
Outliers – Review Information (30 min)
Week 3 Summative Assessment (25 min)

Week 4: Introducing Pandas and Using K-Means to Analyse Data

4 hours to complete
8 videos (Total 37 min), 6 readings, 8 quizzes
8 videos
4.1: Using the Pandas Library to Read csv Files (5 min)
4.1a: Sorting and Filtering Data Using Pandas (8 min)
4.1b: Labelling Points on a Graph (4 min)
4.1c: Labelling all the Points on a Graph (3 min)
4.2: Eyeballing the Data (5 min)
4.3: Using K-Means to Interpret the Data (8 min)
Week 4: Conclusion (35 sec)
6 readings
Week 4 Code Resources (5 min)
Pandas Read_CSV Function (15 min)
More Pandas Library Documentation (10 min)
The Pyplot Text Function (10 min)
For Loops in Python (10 min)
Documentation for sklearn.cluster.KMeans (10 min)
7 practice exercises
Using the Pandas Library to Read csv Files – Review Information (5 min)
Sorting and Filtering Data Using Pandas – Review Information (10 min)
Labelling Points on a Graph – Review Information (5 min)
Labelling all the Points on a Graph – Review Information (5 min)
Eyeballing the Data – Review Information (5 min)
Using K-Means to Interpret the Data – Review Information (5 min)
Week 4 Summative Assessment (40 min)
