0:08

So, if we create stacked graph layouts, stacked bar charts and stacked line graphs of a lot of dependent variables, then we get some interesting effects, as displayed here. You start to see a kind of stream effect, and you can see the same data variable evolving as it moves across the horizontal axis. But you also see some problems with this representation. One is that the variance in the lower variables in the stack can influence the shape of the higher variables. It can also be harder to see, when things get sheared towards the outside, whether a variable is holding the same value or whether it's increasing or decreasing. And you can see that by the time you get to the very top of the graph, it's changing quite violently, because it's the sum of all the changes that have happened below it.

1:05

We can analyze this by using a few variables. At a given position on the horizontal axis, let each variable be represented by, say, f1 for the first variable, f2 for the second variable, and so on: f1, f2, f3. Then we can define, basically, ground zero, g0, to be the baseline level. In this case, we're just setting g0 to be 0, right at the horizontal axis. Then g_i is the position of the top of the plot of the i'th variable. And g_i is just equal to g0 plus that variable and all the variables below it in the stack.
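The stacking rule just described can be sketched in a few lines of Python. This is only an illustration: the function name `stack_tops` and the sample values are made up here, not taken from the lecture.

```python
def stack_tops(values, g0=0.0):
    """Return [g0, g1, ..., gn] for one position on the horizontal axis.

    values is [f1, f2, ..., fn], listed from the bottom of the stack up.
    Each g_i is g0 plus f_i and every variable below it in the stack.
    """
    tops = [g0]
    for f in values:
        tops.append(tops[-1] + f)
    return tops


# Made-up values for three variables at a single horizontal position.
print(stack_tops([2.0, 1.0, 3.0]))  # [0.0, 2.0, 3.0, 6.0]
```

With g0 at 0, the top of the last variable is simply the sum of all the values, which is why the top edge carries every change happening below it.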

1:51

Because of this analysis, there is an alternative layout called ThemeRiver. ThemeRiver basically centers the stack of variables vertically on the horizontal axis. It sets g0 to one-half of the total height of the stack, placed below the axis. By doing that, the way the plot varies at the top is a mirror image of the way it varies at the bottom. You get more of the appearance of a river, where the data is kind of streaming by and evolving. It reduces the amount of shearing that was happening, but it doesn't eliminate it completely.
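As a rough sketch of that centering, assuming the baseline is written as g0 = -1/2 times the sum of the values (so half the stack sits below the axis), the top and bottom edges come out as mirror images. The function names and data are illustrative only.

```python
def themeriver_baseline(values):
    """Baseline g0 that centers the whole stack on the horizontal axis."""
    return -0.5 * sum(values)


def stack_tops(values, g0):
    """Return [g0, g1, ..., gn], stacking each value on the previous top."""
    tops = [g0]
    for f in values:
        tops.append(tops[-1] + f)
    return tops


values = [2.0, 1.0, 3.0]  # made-up values at one horizontal position
tops = stack_tops(values, themeriver_baseline(values))
print(tops)  # [-3.0, -1.0, 0.0, 3.0]
```

The bottom edge is at -3.0 and the top edge at 3.0, so whatever the total does at the top is mirrored at the bottom.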

You can see, for example, that this region right here is creating a large

2:42

shift in the data around it that you'd like to be able to minimize. So just centering the vertical stack of data around the horizontal axis minimizes the height of the chart around that axis, and it also minimizes the slope at the top and the bottom.

There's a Streamgraph layout that does an even better job of this, and it does so just by changing the position of g0. Streamgraph sets g0 equal to the result of this formula. You just evaluate the formula based on your data values f_i, and we won't go through its derivation.
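The formula itself is on the slide and is not captured in the transcript. As an assumption about what is shown, the baseline usually given for the Streamgraph layout, with n variables stacked from the bottom up, is:

```latex
g_0 = -\frac{1}{n+1} \sum_{i=1}^{n} (n - i + 1)\, f_i
```

For comparison, in the same notation the ordinary stacked graph uses g_0 = 0 and the centered ThemeRiver layout uses g_0 = -\frac{1}{2} \sum_{i=1}^{n} f_i.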

But just by changing where the base of this stacked bar chart or stacked line chart sits, and then stacking on top of that new baseline, we get an even smoother appearance. That makes it even easier for us to make comparisons as we move horizontally across the chart and to determine the relative changes in each of these dependent variables. It minimizes the deviation and the wiggle: the deviation being how far a variable's plot moves from its previous position along the horizontal axis, and the wiggle being the slope, basically the shear effect.
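To make the comparison concrete, here is a small sketch that measures wiggle for the three baselines on made-up data. Two things are assumptions rather than lecture content: the Streamgraph baseline is taken to be g0 = -1/(n+1) times the sum of (n - i + 1) * f_i, and wiggle is measured as the sum of squared changes of every g_i between neighboring positions, a discrete stand-in for slope.

```python
def baseline_zero(values):
    """Ordinary stacked graph: the baseline is the horizontal axis."""
    return 0.0


def baseline_themeriver(values):
    """Centered stack: half of the total height sits below the axis."""
    return -0.5 * sum(values)


def baseline_streamgraph(values):
    """Assumed formula: g0 = -1/(n+1) * sum over i of (n - i + 1) * f_i."""
    n = len(values)
    weighted = sum((n - i + 1) * f for i, f in enumerate(values, start=1))
    return -weighted / (n + 1)


def stack_tops(values, g0):
    """Return [g0, g1, ..., gn], stacking each value on the previous top."""
    tops = [g0]
    for f in values:
        tops.append(tops[-1] + f)
    return tops


def wiggle(columns, baseline):
    """Sum of squared changes of every g_i between neighboring positions."""
    layouts = [stack_tops(col, baseline(col)) for col in columns]
    total = 0.0
    for prev, cur in zip(layouts, layouts[1:]):
        total += sum((b - a) ** 2 for a, b in zip(prev, cur))
    return total


# Made-up data: each inner list holds [f1, f2, f3] at one horizontal position.
columns = [[1.0, 2.0, 1.0], [4.0, 2.0, 1.0], [2.0, 3.0, 1.0], [1.0, 1.0, 5.0]]

for name, fn in [("zero", baseline_zero),
                 ("ThemeRiver", baseline_themeriver),
                 ("Streamgraph", baseline_streamgraph)]:
    print(name, round(wiggle(columns, fn), 3))
```

Under this particular measure the assumed Streamgraph baseline can never score worse than the other two, because at every step it picks the baseline shift that minimizes the summed squared changes.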

You can also improve the appearance by changing the order in which you add variables. If variables are zero until a certain position on the horizontal axis and then change from zero to some value, you can choose where they appear in the stack. In this case, they're just stacked in some fixed, arbitrary ordering. You can still see a kind of stream in the coloring; coloring each variable differently helps you perceive that stream.

5:01

But you can also add new variables

Â when the variables take on a value other than zero.

Â You can always add them to the outside of the graph.

Â And what that does, is it takes variables that start out

Â at a certain point on the horizontal axis, as soon as they take on the nonzero value,

Â you add them but you add them to the outside, so

Â their initial surge in value doesn't disturb the other variables.

Â It's happening on the outside of the graph.

Â And then as it's waning, other variables

Â are taking non-zero values and being added to the outside of the graph.

Â And so you get this nice, flowing appearance where you can see the relative

Â change in values of these variables as you move right on the horizontal axis,

Â even though we're looking at 10 or 20 variables in a given stack, vertically.
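One simple way to sketch this ordering: sort the variables by the position where they first become nonzero, then place each new one alternately on the top and bottom edge of the stack. The strict alternation, the function names, and the data are assumptions for illustration; the lecture does not say how the side is chosen.

```python
from collections import deque


def onset(series):
    """Index of the first nonzero value, or len(series) if always zero."""
    for x, value in enumerate(series):
        if value != 0:
            return x
    return len(series)


def outside_order(layers):
    """Return layer names from bottom to top, later onsets further out.

    layers maps a name to its list of values along the horizontal axis.
    """
    stack = deque()
    by_onset = sorted(layers, key=lambda name: onset(layers[name]))
    for k, name in enumerate(by_onset):
        if k % 2 == 0:
            stack.append(name)      # add to the top edge
        else:
            stack.appendleft(name)  # add to the bottom edge
    return list(stack)


# Made-up series: each variable starts at zero and surges at a later position.
layers = {
    "a": [3, 2, 1, 0, 0],
    "b": [0, 2, 3, 1, 0],
    "c": [0, 0, 1, 3, 2],
    "d": [0, 0, 0, 2, 4],
}
print(outside_order(layers))  # ['d', 'b', 'a', 'c']
```

The earliest variable ends up in the middle and the latest arrivals sit on the edges, so a new variable's initial surge pushes only outward.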

So we learned how to make a stream graph, and how something as simple as stacking a bar chart can become a point of further investigation into how to display it effectively. We also learned that, for example, a pie chart can become misleading, especially when shown in three dimensions.

[MUSIC]