Choosing the right visualization for the job requires, first and foremost, understanding what kinds of knowledge the user or the reader needs to build. That is, what are the core tasks that people need to accomplish with their visualizations? In this video, we'll explore and unpack this basic idea of tasks and how tasks can give us a foundational guideline for choosing the right way to represent our data for a given problem. A lot of the material we're going to cover in this module returns to this basic question: how do people use visualizations? The basic argument here is that if we understand how people use visualizations, then we can choose the best visualization to suit their needs. Let's start by looking at this question at a high level. People tend to start by first assembling a visualization. This visualization may either be provided by the author or configured by the user. Then, once they've created that visualization, they can use their eyes and any available interaction to scan the data and to change, manipulate, and update the visualization to learn more about what exactly is going on in their data. As we explore this data, we're generally processing the data that we see, using our cognitive and visual systems to try to make sense of the information and the patterns we see, and pairing that sense-making process with additional knowledge about the data or the problem that we're working with. This workflow typically goes something like this scatterplot example, where we are zooming into different clusters, trying to make sense of those clusters and understand specific relationships within the visualization. We assemble our visualization. We explore the data by zooming and panning around the representation. We use that to generate knowledge about specific properties with regard to the happiness and life expectancy of individual countries, and rinse and repeat.
Typically, when we're dealing with these kinds of visualization problems, it's easy to get lost in this almost pipeline-esque workflow, where we explore the data and then we generate knowledge. However, looking at the process at this level of detail oversimplifies the problem of using visualizations. Oftentimes, the knowledge that we've generated will create new questions, which will cause us to need to further explore the data. Or sometimes, when we go to explore the data to try to build this new knowledge, we might need to backtrack even further. We might need to change the visualization to facilitate our ability to generate this new knowledge. By doing this, what we've actually done is created a cycle where we use our visualization, explore the data, generate knowledge, create new questions with that knowledge, and then rinse and repeat, moving between different points in the cycle depending on how well the visualization allows us to explore the data in the ways we need to generate the knowledge that we want to build. What we can do as visualization developers is try to anticipate the kinds of knowledge people want to generate and then design visualizations and interaction techniques that allow people to generate that knowledge. Doing so requires us to break down the knowledge we wish to build into atomic elements known as tasks. Tasks are the core reason that our visualization tool is being used. The way I like to think about tasks is through a metaphor to functions in programming. They're essentially the micro-scale bits that let us build the macro-scale knowledge, in the same way that functions are the micro-scale operations that allow us to achieve the macro-scale goal of our program. We typically like to abstract tasks as much as possible. We want to separate what we want to do with the data from knowledge of the domain.
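To make the function metaphor a little more concrete, here is a minimal sketch of one abstract task written as a single function that knows nothing about the domain. The function name `most_common_run` and both toy inputs are made up for illustration; they are not from any particular tool or dataset.

```python
from collections import Counter
from typing import Hashable, Sequence


def most_common_run(items: Sequence[Hashable], k: int) -> tuple:
    """Abstract task: find the most frequent run of k consecutive items.

    The function only assumes "a series of things"; it does not care
    whether those things are letters, words, or nucleotide bases.
    """
    runs = [tuple(items[i:i + k]) for i in range(len(items) - k + 1)]
    if not runs:
        raise ValueError("sequence is shorter than k")
    return Counter(runs).most_common(1)[0][0]


# Same task, two domains (both inputs are toy data for illustration).
letters = "papaya"          # a word treated as a series of letters
bases = "ACGTACGGACGA"      # a DNA fragment treated as a series of bases

print(most_common_run(letters, 2))  # ('p', 'a')
print(most_common_run(bases, 3))    # ('A', 'C', 'G')
```

The point of the sketch is only that the operation is defined once, independent of the domain, and the domain shows up solely in the data we pass in.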
One thing you'll find as you start working with people across different disciplines is that, for example, people in literary scholarship may be asking the exact same questions of their data as geneticists. Words are, in fact, a series of letters; texts are a series of words. Genomes are a series of genes, and genes are a series of nucleotide base pairs. Often, when we take a step back and think abstractly about the kinds of operations people want to do with data, even vastly different disciplines are asking, at least at the abstract level, the same kinds of questions of their data. We can use task characterization to understand how we might facilitate this interchange between the data and the user in a way that isn't necessarily biased or complicated by the specifics of the domain, and we can consider tasks at different levels of detail. We can characterize tasks based on these kinds of questions. At the highest level, tasks tell us why the user is using the visualization. Why am I, as a person, looking at this data? We reflect this in the purpose of the visualization. Am I trying to communicate? Am I trying to explore? Is the data merely meant for enjoyment, or am I trying to confirm my own knowledge of the world? We can then take tasks down one level of detail to focus on the mid level of navigation, or what the user is doing. Am I browsing the data to try to find patterns? Am I searching for targeted information? Am I organizing or updating or building unexpected relationships across the data? This is really thinking about tasks in terms of the actions the user takes. At the lowest level, we're thinking about what exactly the patterns are that we're looking for in the data. Are we looking for means? Are we looking for outliers? Are we looking for inversions? Are we looking for peaks? Are we looking for valleys?
What are the statistical attributes that we need to be able to extract from that data in order to build that broader knowledge? One thing I want to encourage you to always do when you're thinking about tasks is to divorce the idea of the task from the idea of the visualization you want to build. The goal of tasks is to help you choose the right visualization. If I have a visualization in mind a priori, that is going to limit and narrow the set of tasks that I think are achievable and how people think they should be achieved. This can lead us to very suboptimal design decisions. For example, if I'm working with geographic data and I decide that I want a map before I understand what people want to do with that geographic data, I may actually create visualizations that are less effective for the questions I want to answer. Say my geographic data is about Electoral College results in the US election. If I want to compare just the sheer count across different states, I probably don't actually need a map for that, because the geography is not doing me any favors. So this is one caveat and one pro tip when you're thinking about tasks: think about tasks before you decide how you're going to represent your data. This idea of thinking about tasks before we represent our data really leads us into the question of who defines a task. A task should always be defined with respect to the role of the user. When we think about task-driven design, the reason that we use tasks, the knowledge people need to build, to drive the visualizations that we choose is that, at the end of the day, as developers, we're building a tool that we're not going to be driving. If we wanted to define the task on the fly, for ourselves as developers, that's a very different interaction than what visualizations are intended for. Visualizations are intended to let the data speak for themselves.
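Going back to the Electoral College example for a moment, here is a small sketch of what "compare the sheer count across states" might look like without a map: a ranked bar chart drawn as plain text. The helper `ranked_bars` is hypothetical, and the state names and numbers are placeholders for illustration only, not real election data.

```python
def ranked_bars(counts: dict[str, int], width: int = 40) -> list[str]:
    """Return one text bar per category, sorted from largest to smallest."""
    top = max(counts.values())
    rows = []
    for name, value in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        # Scale each bar relative to the largest count.
        bar = "#" * round(width * value / top)
        rows.append(f"{name:<10} {bar} {value}")
    return rows


# Placeholder values for illustration only; not real election data.
votes = {"State A": 12, "State B": 30, "State C": 7, "State D": 21}
for row in ranked_bars(votes):
    print(row)
```

Because the task is a comparison of magnitudes, sorting and aligning the bars does the work here; where each state sits geographically plays no part in answering the question.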
Developers and designers can use tasks to try to define how people will use the tools that they create. The reason we do this is that we're trying to anticipate the needs of our users. In the process of creating these representations, we're trying to understand and be proactive about characterizing the knowledge that someone would want to build with this tool. End users, on the other hand, use tasks to answer questions about the data. They want to be able to execute a task and quickly find the answers in their data that they need to build the knowledge they want to have. One of the things that we can do to use tasks to our advantage is integrate this idea of tasks into the interactions that we create. We want to think about how we can design interactions that let users use their target questions, target tasks, or target knowledge to automatically update the visualization. To give you a quick example, since everything we've been talking about so far is very abstract, let's take this core idea of working with scientists. As a visualization developer, I may be working with a scientist who is trying to understand genome sequences for two organisms. One of the tasks they might want to accomplish is to explore the data to assess the similarity between sequences. That may be their high-level objective: they want to build knowledge about how similar the sequences of these two organisms are. We can navigate to an overview level to find large-scale pictures. For example, we can find large elements of the two genomes that agree or disagree with each other. One thing you'll also find in genomics data is that sometimes two elements of a genome match exactly except that they're presented in the opposite order. We can look at a high level to find those large-scale patterns. We can also navigate or drill down to the lower levels of detail.
There, we look at individual base pairs, our As, Cs, Gs, and Ts, to try to find small-scale changes between sequences. These small-scale changes are known as mutations. This is where a single letter changing can actually change how the genome does what it does. Typically, when we're doing this analysis, one thing scientists might want to do with the knowledge they build from conducting these tasks is examine differences between the organisms, to understand how the genome might explain the differences that we see in an organism's appearance, in an organism's ability to do a certain task, or in an organism's metabolism; choose your favorite biological property that is controlled and regulated by the genome. This might feel like a lot of detail for a task, but I would encourage you to always think about tasks in terms of our who, what, when, where, why, and how. By breaking a task into its individual components, we can understand the task in terms of the context of the data and the domain, but we can also understand the task in terms of how we might design for it. What are the critical needs that our systems and our tools need to be able to serve? For example, here, if we look at our who, what, when, where, why, and how framework, we can see who is conducting this task: it's scientists. Why are they doing it? They're doing it to explore their data. How are they doing it? They're navigating between different levels of detail. Where are they doing it? They're doing comparisons between sequences. What are they trying to do? They're trying to understand the similarity between the sequences. And when are they doing it? They're doing it after they have hypotheses about outcomes. They're also doing it when they want to try to understand the differences between those outcomes.
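If it helps to see the breakdown written out, here is a minimal sketch that records the genome task as a small structured record. The class name `TaskDescription` and the `unanswered` helper are assumptions made for illustration; the field values are taken from the example just described.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class TaskDescription:
    """One task, broken into who / what / when / where / why / how."""
    who: str
    what: str
    when: str
    where: str
    why: str
    how: str

    def unanswered(self) -> list[str]:
        """Names of the components that are still blank."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


genome_task = TaskDescription(
    who="scientists",
    what="understand the similarity between the sequences",
    when="after they have hypotheses about outcomes",
    where="comparisons between sequences",
    why="explore their data",
    how="navigate between different levels of detail",
)

print(genome_task.unanswered())  # [] because every component is filled in
```

Writing a task down this way is just one possible bookkeeping device: a blank field is a prompt that we don't yet understand that part of the task well enough to design for it.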
We'll walk through in the next video what each of these who, what, when, where, why, and how components are, and some common examples of how these might manifest in many visualization tasks.