0:00

[SOUND] So, data visualization can consist of some very simple charts,

but the success of a data visualization can often depend on how

we map our data variables to the elements of those charts.

So we can start with a bar chart.

And the bar chart typically has two axes.

You've got a horizontal axis and a vertical axis.

And you're usually plotting discrete values horizontally, and

either a discrete or a continuous value vertically.

And this benefits from the fact that you're mapping a variable,

a data variable, to both position, the actual height of these bars,

as well as to length, the size of the bar.

And so you do a really good job of seeing not only

that, for example, the orange bar is larger than the blue bar, but

how much larger the orange bar is than the blue bar, because

position and length are both at the top of perceptual effectiveness for

displaying quantitative values.

And so usually vertically we have some sort of quantitative dependent variable.

And then horizontally these can be categories.

And so we have some nominal variable, or at least some discrete variable, here

indicating the individual bars that we're plotting.

And this is an independent variable, a dimension.

And then this is some kind of measure of that dimension.

It's a dependent variable, depending on the value of this independent variable.
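As a rough sketch of that mapping (not something shown in the lecture), here's a small Python example that turns a nominal dimension and a quantitative measure into a text bar chart. The data, the function name text_bar_chart, and the horizontal orientation are all assumptions made for illustration; the point is only that each category gets one bar and the measure sets its length.

```python
# Hypothetical data: a nominal dimension (bar name) and a quantitative measure.
sales = {"blue": 4, "orange": 10, "green": 7}

def text_bar_chart(data, symbol="#"):
    """Return one text row per category; the bar's length encodes the measure."""
    width = max(len(name) for name in data)
    rows = []
    for name, value in data.items():
        # The category (independent variable) labels the row,
        # and the measure (dependent variable) sets the bar length.
        rows.append(f"{name.ljust(width)} | {symbol * value} {value}")
    return rows

for row in text_bar_chart(sales):
    print(row)
```

Because the bars share a common baseline, you can compare both where each bar ends and how long it is.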

Similarly, you have a line chart.

A line chart has data points that are connected by a line.

And so this is very, very similar to a bar chart.

These data points are at the same altitude as the tops of the bars.

So they benefit from position, but they don't have the length

that you visually see with the bars in a bar chart.

2:00

So you still do a pretty good job of

being able to discern quantitative values and the relationships among those

values in the altitudes of these data points in a line chart.

And so again we have a quantitative dependent variable vertically that's

2:19

changing based on some quantitative independent variable horizontally.

But now the horizontal value is some quantitative continuous variable, and

the vertical value also needs to be a quantitative continuous variable,

because we're drawing lines between these data points, and these lines imply

that there's a continuity of values between these data points.

These data points have a horizontal and a vertical component,

and these lines have a horizontal and a vertical component.

So you don't want to use a line chart to display data across categories, because

that's implying that there are in-between values between these categories, and

if they're nominal categories, if they're discrete,

then there should not be in-between values.

Your visualization shouldn't imply that there are in-between values.
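One way to see what a connecting line implies is to compute the in-between value it suggests. This is only a sketch under the assumption that the segments are straight lines; the function name interpolate and the sample readings are made up for illustration.

```python
def interpolate(p0, p1, x):
    """Value that a straight segment from p0 to p1 implies at horizontal position x."""
    (x0, y0), (x1, y1) = p0, p1
    t = (x - x0) / (x1 - x0)   # fraction of the way from x0 to x1
    return y0 + t * (y1 - y0)

# Continuous horizontal axis (say, hours): a reading halfway between is meaningful.
print(interpolate((1.0, 20.0), (3.0, 30.0), 2.0))  # 25.0
```

If the horizontal positions stood for nominal categories instead, a position halfway between two of them would not correspond to anything in the data, which is why the line would be misleading there.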

3:08

If we remove the lines, we get a scatter plot.

And a scatter plot gives us some other flexibility.

When we display a line plot, we're displaying a function.

We're displaying some dependent variable

that's changing according to an independent variable,

so that there's one dependent value for every independent value.

3:29

So there's basically one measure for each value of the dimension.
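To make the "one dependent value for every independent value" condition concrete, here's a minimal check you could run on a list of points. The name is_function and the sample points are hypothetical, and this is just one simple way to test the condition.

```python
def is_function(points):
    """True if every horizontal value is paired with exactly one vertical value."""
    seen = {}
    for x, y in points:
        if x in seen and seen[x] != y:
            return False   # the same x appears with two different y values
        seen[x] = y
    return True

print(is_function([(1, 2), (2, 3), (3, 5)]))  # True
print(is_function([(1, 2), (1, 5), (3, 5)]))  # False
```

Data that passes this check can be read as a function of the horizontal variable; data that fails it has several vertical values at one horizontal value.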

When we do a scatter plot, we have two independent variables, so

that I can have the same horizontal value here and

I can have two values associated with that, and so that can be powerful.

You usually don't connect these with a line unless there is some order in

which the data is coming in that you want to associate with a line, and

that would be an additional dimension you could indicate on a scatter plot.

But the line doesn't imply that you're plotting a function, because

a scatter plot doesn't plot a function unless the data's organized that way.

And so you have two independent variables, a horizontal independent variable and

a vertical independent variable.

And you're getting an indication of position, both horizontally and

vertically, for the quantitative values on each of the two axes.

You also get some cues based on density if these points tend to cluster in certain

areas.
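The density cue can be approximated numerically by counting how many points fall into each cell of a grid. This is a sketch, not a method from the lecture; the function name cell_counts, the cell size, and the sample points are assumptions.

```python
from collections import Counter

def cell_counts(points, cell_size=1.0):
    """Count points per square grid cell; high counts mark dense clusters."""
    counts = Counter()
    for x, y in points:
        cell = (int(x // cell_size), int(y // cell_size))
        counts[cell] += 1
    return counts

# Hypothetical points: three close together, one far away.
points = [(0.1, 0.2), (0.4, 0.9), (0.8, 0.5), (3.2, 3.7)]
print(cell_counts(points).most_common(1))  # [((0, 0), 3)]
```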

4:47

And so in this case, we have two independent variables,

things that are no longer related as a function, but

you still get the benefits of position and length.

Gantt charts are usually process diagrams that tell you

the various stages of a project.

And so horizontally a Gantt chart would usually be some display of time.

This may be a quarter, or a date, or some other time axis.

And then vertically, there's some categorical, often discrete or nominal,

independent variable, and this is typically the tasks.

So you'll have the first task and then the second task, and

the second task may start before the first task finishes.

And tasks may stop and then start up again, and so you get this overlap.
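Here's a small sketch of that layout in Python, with time running horizontally and one row per task. The schedule, the task names, and the function gantt_rows are invented for illustration; each task holds a list of (start, end) intervals so that a task can stop and start up again.

```python
# Hypothetical schedule: each task maps to a list of (start, end) time intervals.
schedule = {
    "design": [(0, 4)],
    "build": [(2, 5), (7, 9)],   # starts before design ends, pauses, then resumes
}

def gantt_rows(tasks, horizon):
    """Return one text row per task; '#' marks time units when the task is active."""
    width = max(len(name) for name in tasks)
    rows = []
    for name, intervals in tasks.items():
        cells = ["."] * horizon
        for start, end in intervals:
            for t in range(start, end):
                cells[t] = "#"
        rows.append(f"{name.ljust(width)} | {''.join(cells)}")
    return rows

for row in gantt_rows(schedule, horizon=10):
    print(row)
```

Each run of marks has a position (when the work starts) and a length (how long it lasts), and the overlap between the two rows shows up in time units 2 and 3.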

Again, it benefits from both position and length,

but it operates on two independent variables.

Again, one could be quantitative and one could be nominal, similar to a bar chart.

But in a bar chart you have one dependent variable

plotted over an independent variable.

In a Gantt chart you have two independent variables.

And, finally, you have a table.

In this case, you have two nominal variables, two categories,

for example, and they're independent variables.

One doesn't depend on the other necessarily, and

you're just looking at two separate dimensions and

plotting some value that would be the entry in each of these table cells.

So it really benefits from position only, and again, that position is discrete or

nominal.

6:29

It's not a continuous position, as it would be in a scatter plot.

It's in discrete, quantized regions.
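As a minimal sketch of that arrangement, the following Python builds a grid of cells from records that carry two nominal categories and a value. The function name build_table and the sample records are assumptions, and a missing combination is simply left as None.

```python
def build_table(records):
    """Arrange (row_category, column_category, value) records into a grid of cells."""
    rows = sorted({r for r, _, _ in records})
    cols = sorted({c for _, c, _ in records})
    cells = {(r, c): v for r, c, v in records}
    # Each cell's position is discrete: one slot per (row, column) pair.
    grid = [[cells.get((r, c)) for c in cols] for r in rows]
    return rows, cols, grid

# Hypothetical records: region and quarter are the two nominal dimensions.
records = [("east", "q1", 5), ("east", "q2", 7), ("west", "q1", 3)]
rows, cols, grid = build_table(records)
print(cols)
for name, line in zip(rows, grid):
    print(name, line)
```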

You might also notice, if you look at this long enough,

that you can see some flashing happening at the intersections.

And it's, again, important to remember your perceptual psychology

when you're laying these things out, and to pay attention to

contrast to make sure that you don't get some unwanted perceptual features.

7:27

And your independent variable might be discrete or nominal, some category,

or it might be some quantity that varies continuously, and

your dependent value could similarly be continuous or discrete. Or you could have another

independent variable, which could be a category or a continuously changing value,

independent of your horizontal axis. And so, depending on each

of these configurations, you could look up in this table which chart you want to use.

If you have an independent variable and

a dependent variable, then most often you want to use a bar chart.

You can use a line chart, but only when you have a continuous dependent variable

and a continuous independent variable, because the lines indicate

that there are in-between values both horizontally and vertically.

You want to use a Gantt chart if you have an independent variable

that's continuous and a categorical axis vertically, or

a categorical axis horizontally and a continuous value vertically.

Either one of those will form a Gantt chart.
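These rules of thumb can be written down as a small lookup function. This is one reading of what's said in the lecture, not a reproduction of the table on the slide; the function name suggest_chart, its parameters, and the "categorical" and "continuous" labels are assumptions made for the sketch.

```python
def suggest_chart(x_kind, y_kind, y_depends_on_x):
    """Suggest a chart from variable kinds, each 'categorical' or 'continuous'."""
    both_continuous = x_kind == "continuous" and y_kind == "continuous"
    if y_depends_on_x:
        # One dependent variable measured over one independent variable.
        if both_continuous:
            return "line chart"   # a bar chart would also work here
        return "bar chart"
    # Otherwise the two variables are independent of each other.
    if both_continuous:
        return "scatter plot"
    if x_kind == "categorical" and y_kind == "categorical":
        return "table"
    return "Gantt chart"          # one continuous axis and one categorical axis

print(suggest_chart("categorical", "continuous", True))   # bar chart
print(suggest_chart("continuous", "categorical", False))  # Gantt chart
```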

8:47

So we use the kind of data that we're trying to visualize, whether it's nominal,

ordered, or quantitative, whether it's continuous,

whether it's discrete, and whether variables are dependent or independent,

not only to figure out how they map to chart elements, but,

more importantly, to decide which chart best displays them.
