[SOUND] [MUSIC] Hi, I'm Nicolas Fernandez. I'm a post-doc in Avi Ma'ayan's lab. In this lecture I'll be discussing Visualizing Gene Expression Data using Interactive Clustergrams Built with D3.js. In Part 1 I'll broadly discuss data visualizations and also the JavaScript library D3.js. So data visualizations can help drive discovery and facilitate communication. The field of data visualization is growing in popularity, and because of this there is a large set of tools available to develop visualizations, including web-based and interactive visualizations. Shown on the right here is a popular type of visualization, and that is the visualization of a large network. In this case we're viewing a network of Internet blogs as individual nodes, these little dots. And we can see how these blogs are connected to one another using these thin lines, these links here. So this is a very large network with a very complicated structure. What we can see from this visualization is how some of these blogs form large clusters, and we can get some idea of the degree of interconnectivity between these blogs and of some smaller structure within this larger complicated structure. And of course these data visualizations allow you to see things you would not be able to see by just looking at the data in spreadsheets, for instance. Web-based visualizations have the advantages of ease of use and interactivity, as well as other advantages. And of the tools that are available to build web visualizations, D3.js is probably the most popular. D3 stands for Data-Driven Documents, and it is a JavaScript library that allows you to manipulate web pages using data. So it's not only a very popular visualization tool, but also a popular piece of open source software.
So if you search GitHub for the pieces of open source software that have the most user-generated stars, you'll see that D3 is the fourth most popular repo based on user-generated stars. And this is out of over a million repos, including hundreds of thousands of repos in JavaScript, Python, Java, and other languages. So D3 is a very popular piece of code and it's used in a lot of places. Actually, this graph you see here, I believe, is D3, and so are some of the more interactive features from GitHub itself, like where you can see the number of commits a user has made over a given year. This visualization here is actually built with D3. So you'll find a lot of D3 on the Internet. And if you go to d3js.org quickly, you can see the huge variety of visualizations that can be made with D3. So it's not limited to bar graphs or networks. You can also make all sorts of complicated visualizations, and oftentimes you'll find the code right here to show you how to actually build the visualization. D3 was developed by Mike Bostock, who worked for the New York Times. So a lot of these links will take you to actual New York Times articles where you can see how these visualizations are being used in journalism. One interesting example that shows you some of the unique advantages of web-based visualizations is this visualization of the wealth and health of nations. This is a scatter plot that changes over time. On the y axis you have the life expectancy of a country. The x axis is income per capita, and the radius of the circle shows you the population of the country. And I believe the geographic location determines the color of each of these circles. So you can see how the different properties of these nations change over time.
And what's interesting is if you hover over the year you can actually interact with the visualization and look at important points in time, like World War II for instance, and see how this affects life expectancy and income and also population. So with these types of visualizations, by adding in a layer of interactivity, you allow the user to see not only one view of the data but a view that changes over time or changes with their interaction. And this gives you an idea of how much you can customize a visualization with D3. This is one of the things that makes it so popular. Okay, so D3.js is, like I said before, a JavaScript library that uses data to build web documents. D3.js is primarily designed for visualizing data with SVG, which is Scalable Vector Graphics. But it can more generally work on any component of the web page, that is, the DOM elements of a web page. On the right here is a quick snippet of code from one of the D3 tutorials, which shows you the code required, with a little bit of explanation, to create circles whose radius depends on the data given in this array. And I'll go into more detail on how some of these components, like selections and data binding, work. D3 itself is a pretty low-level visualization library in the sense that you build your visualizations from very simple components like lines and circles, which makes it highly customizable and interactive, but also makes it relatively difficult to build basic visualizations such as a bar graph. So, in order to give you some examples of how D3 works, show you some of the basic things that you need to know to build a visualization with D3, and get over the hurdle of the difficulties involved in building a website, I'll show you some examples on a website called JSFiddle. JSFiddle, and there's a screenshot of it here, is a website that allows you to experiment with HTML, JavaScript and CSS.
So JSFiddle consists of a toolbar here and four panels. Three of the panels are the HTML, the CSS, and the JavaScript, which are the three components required to build a web page, and your finished product is shown in the fourth panel, here on the right. And in this JSFiddle example I'll be covering selections, data binding, and transitions. So, I went ahead and set up a JSFiddle on my JSFiddle account, and it's pretty much a run-of-the-mill JSFiddle. The only difference is that I added one external resource, which is d3.min.js, which you can find from the D3.js website. So I effectively just copied this link here to where D3 is being hosted on cloudflare.com. You could log onto JSFiddle and fork this if you want to use your own, or you could just create one of these on your own. Just to show you quickly how you can build a very simple web page that contains a single paragraph: you hit p, tab, and it writes out a paragraph tag for you. Then you can write here and hit Run, and now you get a paragraph here. So, basically, this website allows you to toy around with web pages. I'll go ahead and leave that there. Okay. So, to start with, I'll discuss D3 selections. Selections are used to tell D3 what components of the web page you want to work on. In this case, we're going to use a selection to change the text of a paragraph. So here the first paragraph, p, is selected, and we're setting the text of the paragraph to 'new text'. Let's go ahead and do this here. We have this original paragraph; we run it, and we see the text here. But now we can use D3 to select p and call .text('new text'). Now when we hit Run, D3 should change the text from this to 'new text', which it does. So that's the first step to anything you do: you have to tell D3 what you want to work on, and you do that with a selection.
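The selection step just described can be sketched roughly as follows. This is a minimal sketch, assuming D3 is already loaded as the global `d3` (for example as a JSFiddle external resource) and that the page contains at least one paragraph. The helper name `replaceFirstParagraphText` and the constant `NEW_TEXT` are made up for illustration and are not from the lecture.

```javascript
// Assumption: runs in a page (or JSFiddle) where D3 is available as the
// global `d3` and the HTML panel contains at least one <p> element.
const NEW_TEXT = 'new text';

// Hypothetical helper: takes the D3 library object so the same code can be
// exercised outside a browser with a stand-in object.
function replaceFirstParagraphText(d3lib, text) {
  // select('p') takes a CSS selector and returns a selection holding the
  // first matching element; text(value) sets that element's text content.
  return d3lib.select('p').text(text);
}

// Only touch the page when both D3 and a DOM are actually present.
if (typeof d3 !== 'undefined' && typeof document !== 'undefined') {
  replaceFirstParagraphText(d3, NEW_TEXT);
}
```

Using `d3.selectAll('p')` instead would apply the same change to every paragraph rather than only the first one.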
And there's a really good selection tutorial by Mike Bostock called How Selections Work that you can find pretty easily online. So, the next big step is to use data to build components of your web page. This is what's referred to as a data join. What this does is take data that you give to the JavaScript library and build components of your web page using this data. And it also binds your data to these components. So in this case this whole complicated piece of code here is effectively selecting the body and selecting all paragraphs, and you don't even need to have any paragraphs initially. Then you tell it you want to join this data, which in this case is an array of six numbers: 10, 20, 30, 40, 50, 60. You enter the elements you want, so your paragraphs are entering the stage, as it's called, and then you append these paragraphs. So effectively, at this point, you will have created six paragraphs. What you can do next is use the individual elements of this data array, which are given by d, to control the text of the paragraph and also the font size. So, effectively, a data join links data to components of your web page. And I'll show you how this works. First I'm going to get rid of this paragraph so we have no paragraphs. Then I'm going to select body. And then I'm going to select all paragraphs. And then I'm going to write out the data that I have: 10, 20, 30, 40, 50, 60. And enter, append paragraphs. What D3 is going to do here is work like a for loop: it's going to create six paragraphs based on the array that I have here. So if I run this, I should get six paragraphs. The next thing is to use the data to control the text of the paragraph. Now, all these examples I'll be showing you are not actually building a formal visualization.
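The data join described above, together with the data-driven text and font size, might look something like this. It is a sketch under assumptions: D3 is loaded as the global `d3`, and the page body starts with no paragraphs, so all six data values land in the enter selection. The names `labelFor`, `fontSizeFor` and `buildParagraphs` are illustrative, and the exact label string is a guess at what is typed in the demo.

```javascript
// The six-element array from the lecture.
const data = [10, 20, 30, 40, 50, 60];

// Accessor functions: D3 calls each one once per element and passes in the
// datum d bound to that element. Forgetting `return` in a function body
// yields undefined, which is why the text would not change.
const labelFor = function (d) {
  return "I'm number " + d + '!'; // label wording is an assumption
};
const fontSizeFor = function (d) {
  return d + 'px'; // the 'px' suffix makes d a pixel size
};

// Hypothetical helper wrapping the chain narrated in the lecture.
function buildParagraphs(d3lib, values) {
  return d3lib.select('body')
    .selectAll('p')   // may match nothing yet; that is expected
    .data(values)     // join the array to the (empty) selection
    .enter()          // placeholders for data with no element yet
    .append('p')      // create one <p> per placeholder
    .text(labelFor)
    .style('font-size', fontSizeFor);
}

// Only build the paragraphs when both D3 and a DOM are present.
if (typeof d3 !== 'undefined' && typeof document !== 'undefined') {
  buildParagraphs(d3, data);
}
```

Keeping the accessors as named functions is only a convenience here; passing anonymous functions inline to `.text(...)` and `.style(...)`, as done in the lecture, behaves the same way.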
But they show you how you can build a visualization with something as simple as a paragraph in this case. You'll see how these paragraphs evolve more towards a visualization as I go on. So what you could do here is programmatically define the text. We'll do that by writing, and let me make this a bit smaller, 'I'm number', then d, then exclamation points. So this should take each of the elements of this array and change the text based on the element of the array. Which it doesn't do. Oh, sorry, I left out the return. Okay, so what we're doing here is writing an anonymous function that takes in an argument d, and D3 calls it once for each element of this array. This d is used to build a string, which is returned and used as the text of the paragraph. And this creates this set of six paragraphs. So we're controlling a visual component of the paragraph, the text, in an abstract way through this data. The next thing we want to do, to make it a little bit more visual, is to control the font size. This is done by using style with font-size, and we're going to also write a function here to programmatically control the font size. So let's see, I'll return d + 'px'. This should control the font size, and it does. You're adding this px to tell it that you want each paragraph to have a font size of d pixels. So the first paragraph has a font size of 10 pixels, the second one 20, and all the way up to 60 pixels. So now you get something that looks a bit more like a visualization. You're actually able to visually see the effect of this data in the text and in the font size. The next thing, and the last component of D3 that I'll show you here in Part 1, is the transition. D3 makes it very simple to have objects transition in time from one state to another. So here, we're going to use this code to transition the background color of the paragraphs from white to black over a period of one second. In this code here, we're selecting all the paragraphs that we've already made.
And we're initializing the background color to white, telling D3 we want a transition with a duration of 1,000 milliseconds, which is the unit D3 uses for durations, and then changing the color to black. So again here, we start by doing our selection, but this time we select all paragraphs. I believe it's style, background color. So we set the background color to white. And we tell D3 we want a transition with a duration of 1,000 milliseconds to black. So this should cause all these paragraphs to transition from white to black over a period of one second. Which happens. Now, since we've already bound data to these paragraphs in the first part, when we select these paragraphs we still have access to this data. So I'm going to show you how we can use this data again, programmatically, to control another visual aspect. What I'm going to do is put in a delay, a delay given by a function. And this should use the data: I'll say 50 milliseconds times d. Whoops. So this should use each element d that we have here and delay the transition from white to black based on the number. Now each of these paragraphs waits longer than the last before it transitions from white to black. This shows you how you can make a visualization out of very simple components, like paragraphs, and how you can bind data to the paragraphs to control their visual components, in this case the text and the font size. And another visual component would be how long it takes for each of these paragraphs to turn from white to black. So that's all for Part 1. And thank you for your attention. [MUSIC]
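For reference, the transition with a data-driven delay described in this part can be sketched like this. It assumes D3 is loaded as the global `d3` and that the paragraphs already have the numbers 10 through 60 bound to them from the earlier data join. `STEP_MS`, `delayFor` and `fadeToBlack` are illustrative names, not from the lecture.

```javascript
// 50 milliseconds per unit of data, as in the lecture.
const STEP_MS = 50;

// Delay accessor: D3 calls this once per paragraph with its bound datum d,
// so the paragraph bound to 10 waits 500 ms and the one bound to 60 waits 3000 ms.
const delayFor = function (d) {
  return STEP_MS * d;
};

// Hypothetical helper wrapping the chain narrated in the lecture.
function fadeToBlack(d3lib) {
  return d3lib.selectAll('p')
    .style('background-color', 'white') // starting state
    .transition()                       // styles set after this are animated
    .delay(delayFor)                    // per-element wait before starting
    .duration(1000)                     // each fade lasts 1000 ms
    .style('background-color', 'black'); // end state
}

// Only run the transition when both D3 and a DOM are present.
if (typeof d3 !== 'undefined' && typeof document !== 'undefined') {
  fadeToBlack(d3);
}
```

Note that the delay only staggers when each fade starts; every paragraph still takes one second to go from white to black once its transition begins.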