Hello, and welcome back. Today we'll be talking about two types of evaluation: formative and summative. The key here is that before you jump into evaluating your system, you need to understand the big picture. What are your goals? Why are you actually conducting this test? And what do you hope to learn? Frequently, this also depends on how you hope the results will be used by the team. What kinds of claims do you want to make about the system that you built? Now, there are two broad approaches, formative and summative, and we'll go into each of these in more detail. Different goals lead to different methods. So, for example, when you want to prove something, you may want to do an experiment, whether it's a field experiment or a controlled experiment. If you want to understand more about potential issues with your system, you may want to do something more qualitative, such as a usability study or a think-aloud method. When you conduct a summative study, you're really focused on quantitative measures of usability. You want to show, you want to prove, that your system is actually a usable, good system. Now, the actual metrics that you collect may depend on the system. So, for example, if you're hoping to show that you've built a kiosk that's very easy to learn and very easy to use the first time somebody walks up to it, you may want to show measures like time to learn a specific feature, or how long it takes the user to complete a specific task and learn how to do that. You may also want to show how many errors they made or how many times they had to backtrack while they tried to figure out how to use this specific feature of the kiosk. If you're building a system that somebody's going to use every day, it may be valuable to show that the amount of time it takes to do a specific task with that system is really short, or perhaps shorter than some sort of baseline.
You want to show that your system prevents users from making errors. And you may also want to show that your system is preferred by users, that it leads to higher levels of user satisfaction. Now, this frequently relies on comparisons to something, whether it's a comparison to a system that's currently used in the field or, if you're trying to show that a specific feature of the system is better, perhaps a comparison with an older version that doesn't have that feature. Really, what you're trying to get to in a summative evaluation is the results, proving that your system works. On the other hand, in a formative evaluation, you really want to understand the quality of the experience, of the usability. So what kinds of tasks are users going to do with this system? What kinds of trouble spots do they run into? And how can you address those trouble spots? You may want to see things like: how do users understand your system? What is the mental model they have about how the system works? And also, their reactions to specific design decisions or design elements that you've introduced. Now, frequently, I find that formative evaluations are particularly good at answering the questions of why. So, say you thought your system would be faster than an alternative. You ran a summative evaluation. You found out that it wasn't. Well, why wasn't it? The summative numbers, the quantitative numbers, are frequently not good at letting you know exactly what went wrong along the way. By getting into that formative, qualitative data, you may be able to see exactly what issues people had along the way. You may be able to interview them about their experiences and find out more about why the things that happened occurred. In the end, you're hoping that a formative evaluation will help you build a better system. And you're hoping to get guidance and ideas about next steps to take with your design work. But now, I promised that we would come back to this table.
So this kind of contrasts the two types of evaluation. In a formative study, you really want to just explore your system. You're frequently getting qualitative, rich data, because you're not really sure yet exactly where you want to go with the data or what you want to prove. You typically control less about what the user does, because you want to understand more about their natural ways of approaching the system and see how they go about doing whatever they want to do. It is generally less formal. You typically do this in the design or prototyping phase, because, frequently, from a formative evaluation, you find out ways that you still want to change your system. So you might not want to do that as you're about to ship a product, if you don't have an opportunity to actually make the changes you hope to make. Formative evaluations are typically cheaper, because you need a smaller number of users. You're not trying to go for things like statistical significance. You're just trying to get ideas for future changes. And typically, the tasks are kept relatively open, to really provide you with a lot of insight about the places to go next and the kinds of tasks that users might actually want to do with your system. Now, in contrast, summative evaluation is really focused on evaluating or showing that your system works. It's much more quantitative, because, typically, to show that something works better than another thing, you need to compare something directly, and that's frequently a number. You typically control what happens more, because you really want to make sure that you're comparing apples to apples when you make these comparisons. So you might want users to do the same tasks in two different systems. These studies are also generally more formal, because you're trying to get that higher level of control and make sure that everybody's doing the same thing in each version of the study. You're doing this in the testing phase.
So your system may be close to complete, and you're really just trying to prove that it is, in fact, worth all the time and effort, and that somebody maybe should switch to the system that you've built. And of course, it usually costs more money, because you're recruiting more participants and you're asking them to do something a lot more comprehensive. And so, frequently, you do end up having to compensate the participants that come into the lab. And in summative evaluation, user tasks are frequently assigned, because, again, you want to make sure that your users are doing the same tasks with the two or three systems that you're actually trying to compare. Now, these are not categories that are carved in stone. So for some summative evaluations, you may be collecting some qualitative data. And it's also possible that in a formative evaluation, you may want to gather some quantitative data. It's also possible that your system is so expensive and difficult to set up, or requires so many users to try it out, or that your users need to be so expert to give you feedback about the system, that it still costs a lot of money to do even a formative evaluation. But generally, comparing the two categories, this is what you're going to get. Now, I actually really like to think about these as kind of a best of both worlds. So, yes, you want to do a lot of formative evaluations early on in the process. But sometimes it's useful to get a little bit of quantitative data to have a better understanding of whether you're on the right track and whether your final summative evaluation is actually going to succeed. And, conversely, when I run a summative evaluation, I also find it useful to gather some qualitative, formative data, because, frequently, things don't work out the way you expect. You may have expected a certain result, let's say in the amount of time it takes to do a task, and you see an unexpected amount.
If you collect qualitative data, you can go back to that and really understand why. And that formative evaluation really provides you with a direction for where to go next. Even if you're about to ship the system, at some point there's going to be a version 2.0 of it, and so that formative evaluation really gives you good guidance about where to go next with it. So, I'd say you're perfectly welcome to combine elements of both of these at times. I think lots of evaluations combine elements of both. The key is, just keep an eye on the final goal and make sure that one element doesn't interfere with another. So one example might be, perhaps in your summative evaluation you want to see how long a person takes to do a particular task. Well, you may not want to combine that with think-aloud at that moment, because a person might take a lot longer if they actually have to explain why they're doing something than if they're just doing a task and trying to be as fast about it as they can. So one way that you may be able to combine summative and formative in that case is: when you have the person run the task, you measure the time, and you ask them to do it as fast as they can without thinking aloud. And then perhaps record video of that, ask them to watch that video later, and have them explain along the way what they were thinking or why they were going about the task in the way they were. So, overall, I think that there are a lot of opportunities to combine the two. But I think it's always important to keep in mind: what are your goals, why are you doing this evaluation, and where can you go with it next? So thank you for joining me for this video, and I hope to see you in the next one.
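As a companion to the summative comparison described in this video (the same assigned task timed on a baseline system and on a new system, along with error counts), here is a minimal Python sketch of how those numbers might be summarized. Everything in it is an assumption for illustration: the data values are made up, the function names `summarize` and `welch_t` are hypothetical, and the two groups are assumed to be different participants. If the same participants used both systems, a paired comparison would be the more appropriate choice.

```python
from math import sqrt
from statistics import mean, stdev


def summarize(times_s, errors):
    """Summarize one condition: mean and spread of task time, mean error count."""
    return {
        "mean_time_s": mean(times_s),
        "sd_time_s": stdev(times_s),
        "mean_errors": mean(errors),
    }


def welch_t(a, b):
    """Welch's t statistic for two independent samples (unequal variances allowed)."""
    se_a = stdev(a) ** 2 / len(a)  # squared standard error of sample a
    se_b = stdev(b) ** 2 / len(b)  # squared standard error of sample b
    return (mean(a) - mean(b)) / sqrt(se_a + se_b)


# Hypothetical measurements: one value per participant, same assigned task.
baseline_times = [48.0, 52.0, 50.0, 55.0, 45.0]  # seconds on the existing system
new_times = [40.0, 42.0, 38.0, 44.0, 36.0]       # seconds on the new system
baseline_errors = [2, 3, 1, 2, 2]
new_errors = [1, 0, 1, 1, 2]

baseline = summarize(baseline_times, baseline_errors)
new = summarize(new_times, new_errors)
t_stat = welch_t(baseline_times, new_times)

print("baseline:", baseline)
print("new:", new)
print("Welch t (baseline vs. new):", round(t_stat, 2))
```

A larger positive t statistic here means the baseline took longer on average relative to the variability in the data. Turning that into a claim of statistical significance would still require a p-value and an adequate number of participants, which this sketch does not compute. And the numbers alone will not say why one system was faster, which is where the qualitative data comes in.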