Science is undergoing a data explosion, and astronomy is leading the way. Modern telescopes produce terabytes of data per observation, and the simulations required to model our observable Universe push supercomputers to their limits. To analyse this data, scientists need to think computationally. In this course you will investigate the challenges of working with large datasets: how to implement algorithms that work; how to use databases to manage your data; and how to learn from your data with machine learning tools. The focus is on practical skills: all the activities will be done in Python 3, a modern programming language used throughout astronomy.
Offered By
Data-driven Astronomy
The University of Sydney
About this Course
Skills you will gain
- Python Programming
- Machine Learning
- Applied Machine Learning
- SQL
The University of Sydney
Our excellence in research and teaching makes the University of Sydney one of the top universities in Australia and highly ranked among the best universities in the world. In 2020, we were ranked second in the Times Higher Education (THE) University Impact Rankings, and first in Australia in the QS Graduate Employability Rankings.
Syllabus - What you will learn from this course
Thinking about data
This module introduces the idea of computational thinking, and how big data can make simple problems quite challenging to solve. We use the example of calculating the median and mean stack of a set of radio astronomy images to illustrate some of the issues you encounter when working with large datasets.
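As a rough illustration of what stacking means, the sketch below combines a few tiny same-shaped images pixel by pixel. The `stack_images` helper and the 2x2 toy "images" are hypothetical and use only the standard library; they are not the course's own code or data, and real images would normally be loaded from files and handled as arrays.

```python
from statistics import mean, median


def stack_images(images, reducer):
    """Combine same-shaped 2D images pixel by pixel using `reducer`."""
    n_rows = len(images[0])
    n_cols = len(images[0][0])
    return [
        [reducer([img[r][c] for img in images]) for c in range(n_cols)]
        for r in range(n_rows)
    ]


# Three hypothetical 2x2 images; the third has one outlier pixel.
images = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[1.0, 2.0], [3.0, 4.0]],
    [[1.0, 2.0], [3.0, 40.0]],
]

mean_stack = stack_images(images, mean)      # outlier drags the pixel up
median_stack = stack_images(images, median)  # outlier is ignored
```

One reason this gets harder with large datasets: a mean can be accumulated one image at a time, whereas an exact median needs every value for a given pixel to be available at once.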
Big data makes things slow
In this module we explore the idea of scaling your code. Some algorithms scale well as your dataset increases, but others become impossibly slow. We look at some of the reasons for this, and use the example of cross-matching astronomical catalogues to demonstrate the kinds of improvements you can make.
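To make the scaling problem concrete, here is a minimal sketch of a brute-force cross-match, assuming each catalogue is simply a list of (RA, Dec) pairs in degrees. The function names, the toy coordinates, and the match radius are all hypothetical. Every object in the first catalogue is compared with every object in the second, so the work grows with the product of the two catalogue sizes.

```python
from math import asin, cos, degrees, radians, sin, sqrt


def angular_distance(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(radians, (ra1, dec1, ra2, dec2))
    a = sin((dec2 - dec1) / 2) ** 2
    b = cos(dec1) * cos(dec2) * sin((ra2 - ra1) / 2) ** 2
    return degrees(2 * asin(sqrt(a + b)))


def crossmatch(cat1, cat2, max_radius):
    """Return (i, j, distance) for the closest cat2 object within max_radius."""
    matches = []
    for i, (ra1, dec1) in enumerate(cat1):
        best_j, best_d = None, max_radius
        for j, (ra2, dec2) in enumerate(cat2):  # inner loop: O(len(cat2))
            d = angular_distance(ra1, dec1, ra2, dec2)
            if d <= best_d:
                best_j, best_d = j, d
        if best_j is not None:
            matches.append((i, best_j, best_d))
    return matches


# Hypothetical toy catalogues: (RA, Dec) in degrees.
cat1 = [(10.0, -30.0), (200.0, 45.0)]
cat2 = [(10.001, -30.0), (120.0, 5.0)]
matches = crossmatch(cat1, cat2, max_radius=0.01)
```

Typical improvements avoid most of the inner-loop comparisons, for example by sorting one catalogue by declination and only checking objects inside a narrow band.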
Querying your data
Most large astronomy projects use databases to manage their data. In this module we introduce SQL - the language most commonly used to query databases. We use SQL to query the NASA Exoplanet database and investigate the habitability of planets in other solar systems.
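The flavour of such a query can be sketched with Python's built-in `sqlite3` module and an in-memory database. The `planet` table, its columns, the toy rows, and the selection thresholds below are all made up for illustration; they are not the schema or contents of the NASA Exoplanet database, nor a real habitability criterion.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: radius in Earth radii, t_eq = equilibrium temperature in K.
conn.execute("CREATE TABLE planet (name TEXT PRIMARY KEY, radius REAL, t_eq REAL)")
conn.executemany(
    "INSERT INTO planet VALUES (?, ?, ?)",
    [
        ("toy-a", 1.1, 270.0),
        ("toy-b", 11.0, 1500.0),
        ("toy-c", 0.9, 310.0),
        ("toy-d", 2.5, 180.0),
    ],
)

# Select roughly Earth-sized, temperate planets (arbitrary illustrative cuts).
query = """
SELECT name, radius
FROM planet
WHERE radius BETWEEN 0.5 AND 2.0
  AND t_eq BETWEEN 200 AND 350
ORDER BY radius DESC
"""
rows = conn.execute(query).fetchall()
```

The query describes which rows are wanted rather than how to loop over them, which is what lets the database decide how to execute it efficiently.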
Managing your data
This module introduces the basic principles of setting up databases. We look at how to set up new tables, and then how to combine Python and SQL to get the best out of both approaches. We use these tools to explore the life of stars in a stellar cluster.
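A minimal sketch of that workflow, again using the standard-library `sqlite3` module: a table is created with a few constraints, rows are inserted from Python, SQL does the filtering, and Python does the final calculation. The `star` table, its columns, and the toy values are hypothetical and are not the course's dataset.

```python
import sqlite3
from statistics import median

conn = sqlite3.connect(":memory:")

# Hypothetical schema: constraints reject missing or non-physical values.
conn.execute(
    """
    CREATE TABLE star (
        id INTEGER PRIMARY KEY,
        cluster TEXT NOT NULL,
        mass REAL NOT NULL CHECK (mass > 0)  -- solar masses
    )
    """
)

stars = [
    (1, "toy-cluster", 0.8),
    (2, "toy-cluster", 1.2),
    (3, "toy-cluster", 5.0),
    (4, "other", 2.0),
]
with conn:  # commits on success, rolls back on error
    conn.executemany("INSERT INTO star VALUES (?, ?, ?)", stars)

# SQL selects and orders the rows; Python computes the statistic.
cursor = conn.execute(
    "SELECT mass FROM star WHERE cluster = ? ORDER BY mass", ("toy-cluster",)
)
masses = [mass for (mass,) in cursor]
median_mass = median(masses)
```

Passing the cluster name as a `?` parameter, rather than formatting it into the query string, keeps the query safe from malformed input.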
Reviews
- 5 stars: 84.76%
- 4 stars: 13.55%
- 3 stars: 1.17%
- 2 stars: 0.25%
- 1 star: 0.25%
TOP REVIEWS FROM DATA-DRIVEN ASTRONOMY
Excellent presentation and delivery method by the instructors and very good course content. Highly recommend it to anyone curious about the Universe who wants to understand it better!
This is a well set course. I have completed one week and I loved the blend of maths, astronomy and tools! Course content is not outdated, which is really important for a field like this.
This course is exceptionally good, well developed and structured. The content of the course is good. The teachers have demonstrated the concept well. I would like to learn more on this concept.
This is a great course for anyone wanting to do data science with astronomical datasets. The lectures are clear and interesting and the activities are well structured. I really enjoyed this course!