0:11
Hello, this lesson is going to introduce you to the techniques
to actually construct a simple visualization in Python.
We'll be doing this primarily by using the Matplotlib module, but
we'll also use the Seaborn module, which is a second visualization module
that can be used to easily improve the appearance of a Python plot that was
constructed by using Matplotlib, so we often use these two together.
Matplotlib is a very powerful package.
It enables you to do just about anything you could imagine in a Python
script that makes a visualization.
But that complexity can also make it difficult to understand how to do things.
1:24
If you are running the notebook and trying to make visualizations, and
you can't see them, make sure this line is in your notebook at the start.
The second thing is that we have these two lines here,
which suppress warnings that sometimes occur when we're making visualizations.
For instance, there might be fonts that are trying to be used, and
they're not appearing.
We don't want to know about those.
They don't affect the creation of the plot.
They just simply are warning messages saying, we had to use a different font.
1:51
Now, many Python modules, such as Pandas, include built-in plotting functionality.
And I'm not trying to say these aren't important; you will often want to
make a quick plot and just see whether this looks like it's worth pursuing or not.
And you can use these as demonstrated here, where we take the total_bill and
tips columns out of the tips data frame and we simply plot a histogram.
And you can see here is the result.
That's actually quite impressive if you think about it.
We had one line of code to read the data in,
and the second line of code plotted the data.
This should demonstrate one of the reasons that learning to program and learning to
code is so important, because we very quickly got a simple visualization.
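As a rough sketch of that one-line Pandas plot: the lesson reads the real tips data from a file, so the small data frame below is a hypothetical stand-in, and the column name tip is an assumption about how the tips column is spelled in the data.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs as a script
import pandas as pd

# Hypothetical stand-in rows; the lesson loads the real tips data instead.
tips = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59, 25.29],
    "tip": [1.01, 1.66, 3.50, 3.31, 3.61, 4.71],
})

# One line selects the two columns and draws their histograms on one Axes.
ax = tips[["total_bill", "tip"]].plot.hist(bins=5, alpha=0.5)
```

In a notebook the matplotlib.use line can be dropped, since the plot is shown inline.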
2:36
Now, what we want to do though is think about the overall
process of making a figure.
When we want to make a plot, and we use Matplotlib,
we first need to import the library and this is the way we're going to do it.
We're going to import matplotlib.pyplot as plt.
Any time we want to reference something out of the Matplotlib module,
we then have to do plt.
So we want to create a figure; what do we do?
Well, we say plt.figure, and that creates a figure object.
We also need what is known as an Axes object.
This is the actual thing that's going to hold our plot.
We can have multiple axes in a figure.
So we can either create a subplot from our figure, or
we can create the figure and its subplots in one step, even if it's just one
subplot, by calling plt.subplots, which returns both the figure and the axes.
We'll see how to make multiple subplots in the same figure
in a subsequent lesson.
When we do this, we can also specify how big we want our figure to be.
We use the figsize parameter, and we pass in a tuple with the width and
height specified; traditionally this is in inches.
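Here is a minimal sketch of the two routes just described. The variable names are illustrative, and the Agg backend line is an assumption so the code also runs outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend for script use
import matplotlib.pyplot as plt

# Route 1: create the figure first, then add a single Axes to it.
fig1 = plt.figure(figsize=(10, 5))   # width, height in inches
ax1 = fig1.add_subplot(1, 1, 1)

# Route 2: create the figure and its Axes in one call.
fig2, ax2 = plt.subplots(figsize=(10, 5))

# In a notebook, ending the cell with plt.show() displays the figures.
```

Both routes give an empty 10 by 5 inch figure whose Axes runs from 0 to 1 in each direction.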
3:47
And we can then get our plots, so here we go, we are going to make our first plot.
We import, we then say our figure size is 10 by 5 and then we call plt.show.
Now, strictly speaking, we don't need this line.
But if we leave it out, the notebook will display the value returned by
the last line of the cell, and so rather than showing that, I'll call plt.show.
For instance, I can change that and you'll see the result if I don't call it.
Now this is a pretty boring plot.
There's nothing in it.
It's just a big rectangle that's 10 inches by 5 inches,
with 0 to 1 laid out on both axes.
Let's start adding more information.
First, we can actually enter real data, so we're going to create data.
Here we have a simple linear equation, the slope is m, the intercept is b,
and we're going to linearly space data between 0 and 10.
We're then going to plot it, and all we do is pass in our two data sets, x and y, and
we get a nice line.
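A sketch of that step might look like the following. The transcript doesn't give the values of m and b, so the slope of -2 and intercept of 5 are assumptions, chosen only because they fit the axis ranges mentioned later.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend for script use
import matplotlib.pyplot as plt
import numpy as np

m = -2   # slope (assumed value)
b = 5    # intercept (assumed value)

x = np.linspace(0, 10)   # linearly spaced points between 0 and 10
y = m * x + b            # the linear equation y = m x + b

fig, ax = plt.subplots(figsize=(10, 5))
ax.plot(x, y)            # pass in the two data sets to draw the line
```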
And you could see that's pretty nice.
Let's actually label our axes though, right?
And we can also add a title.
So, if we add these calls in, you can see that we get our axis labels.
And we're going to say the x label for this particular axes is X Axis,
y is Y Axis, and our title is Our First Plot.
And there you go, we've now decorated our plot.
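The decorating calls can be sketched like this, using the label and title text from the lesson. The data values are the same assumed slope and intercept as before.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend for script use
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10)
y = -2 * x + 5           # assumed slope and intercept

fig, ax = plt.subplots(figsize=(10, 5))
ax.plot(x, y)

# Label the axes and add a title on this particular Axes object.
ax.set_xlabel("X Axis")
ax.set_ylabel("Y Axis")
ax.set_title("Our First Plot")
```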
Now if we had multiple plots shown in the same figure,
we would have to change which axes object we call these on.
It's a little bit confusing at times to think that Matplotlib calls the subplot
itself an axes, even though that subplot has its own x and y axis.
But after working with it for a while, you'll get the idea.
We can also control the range over which our x and y axes are displayed.
We do that with set_xlim and set_ylim.
So here x is going from -2 to 12 and y will be -20 to 10.
And we can also specify where the tick labels should appear on our x axis and
where they should appear on our y axis.
Notice that we're using the NumPy arange function here, which is going to go
0, 5, 10.
And this one will go -15, -10, -5, 0, and 5.
And so if you look at it, that's exactly what's shown.
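As a sketch, the limits and tick positions quoted above could be set as follows. The exact arange arguments aren't given in the transcript, so the stop values here are assumptions chosen to produce those ticks, and set_xticks and set_yticks are the Axes methods I'm assuming the notebook uses.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend for script use
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10)
y = -2 * x + 5           # assumed slope and intercept

fig, ax = plt.subplots(figsize=(10, 5))
ax.plot(x, y)

# Range of each axis.
ax.set_xlim(-2, 12)
ax.set_ylim(-20, 10)

# Tick positions; arange excludes the stop value.
ax.set_xticks(np.arange(0, 15, 5))     # 0, 5, 10
ax.set_yticks(np.arange(-15, 10, 5))   # -15, -10, -5, 0, 5
```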
6:09
Now you can play with this and try it out to make your own plot and
you can change the labels as indicated here.
What if we want to plot multiple functions?
We can display multiple data sets on the same plot
by simply calling the plot function on the same axes object.
If we do that, we can get two lines.
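For example, a sketch with two lines on one Axes. The second function is hypothetical, since the transcript doesn't say which one the notebook uses, and the legend is an optional extra.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend for script use
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10)
y1 = -2 * x + 5          # assumed first line
y2 = 0.5 * x - 10        # hypothetical second line

fig, ax = plt.subplots(figsize=(10, 5))

# Calling plot twice on the same Axes draws both lines together.
ax.plot(x, y1, label="y1")
ax.plot(x, y2, label="y2")
ax.legend()
```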
6:28
We can also save our plot.
If you want to save it as a PDF,
you simply end the name of your file in .pdf and that's what will be created.
If you use .png, it will create a PNG, which is a different image format.
We can create this file, and then we will execute.
Notice that we've saved our file, and it still shows up in our notebook.
But then when we look, we can see that the file is there.
So these are two UNIX commands. The first lists the contents,
to check whether that file exists.
There it is.
And then the second removes it, so that our notebook is clean and
we can re-create it later.
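The save, check, and remove steps can be sketched in pure Python. The lesson uses UNIX commands from the notebook for the last two steps; the os calls below are a swapped-in equivalent, and the file name and temporary directory are hypothetical.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend for script use
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10)
fig, ax = plt.subplots(figsize=(10, 5))
ax.plot(x, -2 * x + 5)   # assumed slope and intercept

# The file extension selects the output format (.pdf here, .png also works).
out_dir = tempfile.mkdtemp()
pdf_path = os.path.join(out_dir, "first_plot.pdf")   # hypothetical name
fig.savefig(pdf_path)

existed = os.path.exists(pdf_path)   # stands in for listing the directory
os.remove(pdf_path)                  # stands in for removing the file
os.rmdir(out_dir)
```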
The last thing in this notebook is how to use the Seaborn library to make
things prettier: to make the visualization look better and focus on the
important aspects.
7:14
The first two lessons in this module actually
talked about this idea of making things visually appealing and intuitive.
These ideas build on some of the concepts introduced
by Edward Tufte into the visualization landscape.
And the Seaborn library does a nice job of making this easy.
It includes lots of functions that can generate very impressive plots and
we'll see those in subsequent lessons.
But we can actually simply import Seaborn and
call the set method and our plots will now look much better.
Notice how the colors are better looking, and it also has laid out these ticks so
that we can quickly see what the values of our data are.
For instance, here you see that it is 5, -5 and this is 10, -15.
These sorts of visual clues can be useful to help improve the understanding of
your plot.
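A minimal sketch of that Seaborn step, assuming the same line plot as before. Calling sns.set with no arguments applies Seaborn's default theme to every Matplotlib plot drawn afterwards.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend for script use
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

# Apply Seaborn's default theme; it changes Matplotlib's global settings.
sns.set()

x = np.linspace(0, 10)
fig, ax = plt.subplots(figsize=(10, 5))
ax.plot(x, -2 * x + 5)   # assumed slope and intercept
```

The plotting code itself is unchanged; only the styling differs, including the background grid that helps with reading values off the plot.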
There are some other things you can do with Seaborn, and
the rest of the notebook demonstrates them.
Here's another way of demonstrating it.
We can make our labels or fonts bigger, change fonts, etc.
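The transcript doesn't show which Seaborn calls the notebook uses for this, so the following is only one possible sketch: the style and font_scale arguments of sns.set, with values picked purely for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend for script use
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

# Assumed settings: a plain white style with all font sizes scaled up by 1.5.
sns.set(style="white", font_scale=1.5)

x = np.linspace(0, 10)
fig, ax = plt.subplots(figsize=(10, 5))
ax.plot(x, -2 * x + 5)   # assumed slope and intercept
ax.set_xlabel("X Axis")
ax.set_ylabel("Y Axis")
ax.set_title("Our First Plot")
```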
I encourage you to play around with these things and see how they work.
Your time will be well worth it.
If you have any questions, let us know, and good luck.