[MUSIC] Hi, welcome to our MOOC on data-driven astronomy. I'm Tara Murphy from the University of Sydney, where I do research in radio astronomy and teach computational physics. I'll discuss the structure of the course in later videos. But for now, let's dive right in.
Since the dawn of time, people have looked to the sky to ask big questions about our existence. Who are we? What causes the seasons? How did the Earth come to be? As astronomy became a science, we were able to start asking quantitative questions and making high-precision measurements. Today we understand so much more about the universe, and yet there are still huge unanswered questions, such as: How did the first stars and galaxies form? What is the nature of dark matter? And is there life on other planets?
Modern astronomers are trying to answer these using some of the world's most incredible scientific instruments. But these instruments generate huge amounts of data, up to petabytes per night. Likewise, the simulations required to reproduce our observable universe are complex, and they generate large and unwieldy datasets. Astronomy is well and truly in a data-intensive era. This means the traditional ways we approached science in the past simply won't work anymore. Large and complex datasets make easy problems difficult to solve, and we need to think in a computational way when we're designing, conducting and analyzing our observations and simulations.
Major instruments have dedicated supercomputing facilities, software engineers and database administrators. So you won't have to take the petabytes of data straight off the Square Kilometre Array and process it yourself. But even the processed data presents significant challenges. Every individual scientist will still have to deal with what we can think of as medium-sized data: datasets that are big enough that manual methods won't work, and complex enough that we'll need to change the way we think about solving problems.
What exactly do I mean by that? I'd like to share a story with you that shows how having big data can make easy problems really, really difficult if you don't tackle them the right way. Every year I supervise undergraduate research students. A couple of years ago, one of my students was trying to find faint signals from pulsars that were below the noise level in individual images. We'll talk more about this later in the module. I suggested he stack the images together using averaging, literally adding all the images to increase the signal-to-noise ratio. He went off and implemented a nice little Python script to do this and presented his results in our next meeting.
He then discussed his results with a colleague, who suggested that the median is a more robust statistical measure than the mean. You probably learned this in high school, but if not, we've got some resources that explain why. So my student happily went off to modify his code to calculate the median rather than the mean, tested it on a small dataset, and so far so good. The next I heard was an email from our system administrator asking why one of my students had brought our entire cluster to a standstill, and did I know what was going on? The system administrator then cancelled all the jobs so other people could continue working. But what had happened? How could something so simple turn into such a big problem? This is the kind of question we'll address in the first module of this MOOC.
In this MOOC, we're going to show you how to work with data to answer questions about astronomy. The techniques apply to any scientific discipline. So this course is suitable for astronomers who want to learn more about computational approaches, programmers who want to learn something about modern astronomy, and students who want to improve their data analysis or coding skills. The aim is to help you become more efficient and effective in your technical domain, using some exciting examples from astronomy.
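To make the stacking story above a little more concrete, here is a minimal sketch of the two approaches. It is not the student's actual script: the function names `mean_stack` and `median_stack` are hypothetical, and the images are assumed to be NumPy arrays of identical shape. The transcript does not say why the cluster stalled; the sketch only illustrates one plausible difference, which is that a mean can be accumulated one image at a time while a naive median needs every image in memory at once.

```python
import numpy as np


def mean_stack(images):
    """Mean-stack an iterable of equally shaped 2D arrays.

    Only a single accumulator image is held in memory, so the cost
    does not grow with the number of images.
    """
    total = None
    count = 0
    for img in images:
        if total is None:
            # Float accumulator with the same shape as the first image.
            total = np.zeros_like(img, dtype=float)
        total += img
        count += 1
    if count == 0:
        raise ValueError("no images to stack")
    return total / count


def median_stack(images):
    """Naive median stack (assumed approach, for illustration only).

    All images are loaded into one 3D cube before the per-pixel median
    is taken, so memory use grows linearly with the number of images.
    """
    cube = np.stack(list(images))
    return np.median(cube, axis=0)


# Tiny deterministic example: three 2x2 "images", one of them an outlier.
demo = [np.full((2, 2), value) for value in (1.0, 2.0, 100.0)]
print(mean_stack(demo))    # every pixel is 103/3, about 34.33
print(median_stack(demo))  # every pixel is 2.0
```

In this toy example the outlier image drags the mean far from the typical value while the median ignores it, which is the robustness the colleague had in mind. The price, in this naive form, is holding the whole image cube in memory.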
Over the full course, you'll learn how to manage datasets so that you can do science more effectively, how to automatically find patterns in your data using machine learning, and how to think about algorithms in smarter ways. On the way, we'll talk about data formats and coding best practice, with loads of practical examples, as well as covering a whole range of astronomy topics, from planets, to pulsars, to black holes. If that sounds interesting to you, join us as we investigate how to do astronomy in the era of big data. [MUSIC]