So, now you've conducted your user test and it's time to analyze your results and figure out what it is that you've learned. At this point, if you've done a typical user test, you've conducted five to seven test sessions, which means that you've observed users attempting probably 25 to 50 tasks. You've got, let's say, four to 10 hours of video, a pile of logging sheets, and a whole bunch of questionnaires. So, you have a whole bunch of data. What are you going to do to get through it? Well, first, you want to start out by remembering your goals. Why were you doing this in the first place? Your typical user testing goal, as you'll recall, is to answer the question: can user group X use system Y to do activity Z? Hopefully, you recruited representatives of group X, so people who met the criteria for your testing population. Hopefully, you designed tasks that are a good representation of the activity that you're interested in. So, of course you did that, of course you've got that part right. So now the question is, how did they do? Usually, it's not 100 percent clear. Typically, you'll find that some of your participants did well on most of the tasks, some users did well on some of the tasks, and some users didn't really do well on many of the tasks, or any of them. So, the first thing you want to do is collate or collect your baseline statistics. These are things like task success and failure: how many people succeeded and how many failed at each of the tasks that you designed? What kind of errors were made, and how many? What was the timing, that is, how long did it take people to complete each of the tasks? Then you're going to want to review the critical incidents that occurred. Where did the breakdowns occur? Where did people run into trouble, and why?
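Tallying those baseline statistics is mechanical once your session logs are in a consistent shape. Here is a minimal sketch, assuming a hypothetical log format of one record per participant-task attempt (the field names and values are illustrative, not from any particular logging tool):

```python
from statistics import mean

# Hypothetical log: one record per participant-task attempt.
attempts = [
    {"participant": "P1", "task": "T1", "success": True,  "errors": 0, "seconds": 45},
    {"participant": "P2", "task": "T1", "success": True,  "errors": 1, "seconds": 62},
    {"participant": "P1", "task": "T2", "success": False, "errors": 3, "seconds": 180},
    {"participant": "P2", "task": "T2", "success": True,  "errors": 1, "seconds": 95},
]

def baseline_stats(attempts):
    """Per-task success/failure counts, error totals, and mean completion time."""
    stats = {}
    for a in attempts:
        s = stats.setdefault(a["task"], {"succeeded": 0, "failed": 0,
                                         "errors": 0, "times": []})
        s["succeeded" if a["success"] else "failed"] += 1
        s["errors"] += a["errors"]
        s["times"].append(a["seconds"])
    return {t: {"succeeded": s["succeeded"], "failed": s["failed"],
                "errors": s["errors"], "mean_seconds": mean(s["times"])}
            for t, s in stats.items()}

print(baseline_stats(attempts))
```

The per-task dictionaries this produces map directly onto the success-count and timing charts discussed next.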
You're going to want to collate the results from your questionnaires as well, and you're going to want to interpret the responses that you got during the debrief and during the interview portions of your test sessions. Collating success and failure on tasks is usually pretty straightforward, especially if you have very crisp and clear success criteria. You can define whether people succeeded or failed at your tasks, and you can produce, let's say, a bar chart showing the number of participants who succeeded at each task. In this hypothetical case, seven participants succeeded at task one, six at task two, three at task three, and one at task four. You can do something similar with the time it took participants to complete each task. In this case, it took people not very long to complete task one, a lot longer to complete task four, and the other tasks fell somewhere in between. In some cases, it's useful to communicate that participants had different levels of success in completing tasks. This might be a case where you offered hints or help, and if you did, you want to indicate that those users were less successful than users who didn't need any help or hints. So, here's an example of a way that might be presented. In this case, we're actually showing how each user in a particular user test did on each of the tasks. You can see that there are different levels of success. Here, green is showing success with no help, a yellow exclamation point is showing success with some help or some hints, and the skull and crossbones is showing cases where users actually failed at a task even after receiving hints. The orange boxes indicate tasks that were not attempted, in this case due to technical or timing reasons. In addition to collating the baseline performance statistics, you're going to want to look at the responses to any questionnaires that you administered.
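A per-participant, per-task grid like the one described above can be generated from the same session data. This is a small sketch with made-up outcome codes and symbols standing in for the colored chart (all names here are illustrative assumptions):

```python
# Hypothetical per-participant, per-task outcomes:
# "ok" = success, "help" = success with hints,
# "fail" = failed even with hints, "skip" = not attempted.
results = {
    "P1": {"T1": "ok",   "T2": "ok",   "T3": "help", "T4": "fail"},
    "P2": {"T1": "ok",   "T2": "help", "T3": "fail", "T4": "skip"},
    "P3": {"T1": "help", "T2": "ok",   "T3": "fail", "T4": "fail"},
}
SYMBOLS = {"ok": "●", "help": "!", "fail": "☠", "skip": "□"}

def render_grid(results, tasks=("T1", "T2", "T3", "T4")):
    """Render a plain-text stand-in for the success-levels chart."""
    lines = ["      " + "   ".join(tasks)]
    for participant, row in results.items():
        lines.append(f"{participant:>4}    " +
                     "    ".join(SYMBOLS[row[t]] for t in tasks))
    return "\n".join(lines)

print(render_grid(results))
```

Even this crude text version makes the pattern visible at a glance: task one was easy for everyone, while tasks three and four were where failures clustered.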
So, for example, the scores on the SUS, or System Usability Scale. You're going to want to look at the responses to high-level questions such as: What did you think the system did well? Where does the system need the most improvement? And so on. You're going to be most interested in repeated responses, or places where multiple participants gave the same or similar answers to these questions. So, you can start to see themes and the general responses that you got from different participants. But these baseline statistics and these general observations are only going to tell part of the story, and in fact, they're not going to tell the most interesting part of the story. The most interesting part of the story is: why did these different things happen? Why did people succeed or fail at particular tasks, and why did people have the responses that they did to the system and to the experience? In order to explain the outcomes that you observe, you need to go through and review all of the critical incidents that you captured on your recordings and that you logged as you went through and reviewed the recordings. You want to consolidate the critical incidents so you don't have duplicates of essentially the same thing, and places where you see a recurrence of the same incident or the same problem are going to emerge as perhaps more important or more significant than ones that were only experienced by one or two people. Then, given this list of incidents or problems, you're going to need to figure out why those problems happened. So, if a user failed at a task or encountered an error, what was it in their interaction and in the system design that caused that to happen? You need to think about what usability principles, if any, were violated that might help explain why those problems happened. Also think about whether multiple problems stem from the same root causes.
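The SUS mentioned above has a standard scoring procedure worth automating: each of the ten items is rated 1 to 5, odd-numbered (positively worded) items contribute their rating minus one, even-numbered (negatively worded) items contribute five minus their rating, and the sum is multiplied by 2.5 to yield a 0-100 score. A minimal implementation:

```python
def sus_score(responses):
    """Score one System Usability Scale questionnaire.

    `responses` is a list of ten ratings (1-5) in item order.
    Odd-numbered items contribute (rating - 1); even-numbered
    items contribute (5 - rating); the total is scaled by 2.5.
    """
    if len(responses) != 10:
        raise ValueError("SUS has exactly ten items")
    total = sum((r - 1) if i % 2 == 0 else (5 - r)
                for i, r in enumerate(responses))
    return total * 2.5

# Best possible answers (5 on odd items, 1 on even items) score 100.
print(sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1]))  # 100.0
```

Note that a SUS score is not a percentage; scores are usually interpreted against published norms, with scores around 68 commonly cited as average.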
Maybe there's a pattern of design throughout the system that is misaligned with users' mental models or with what users expect, and that's causing a lot of the different problems that you're seeing. For each issue that you identify, you need to assess the severity and use those severities to prioritize the most important or key findings that you're going to distill and report from your user test. For those key findings, you're going to go on and add some more detail. You need to clearly describe the problem well enough that somebody else would be able to go in and fix it. You're going to need to provide evidence of that problem, so that somebody who might be skeptical can be persuaded that yes, indeed, there is a problem, because multiple people encountered it. And you're going to need to suggest a course of action: something that you think the system designer should do in order to fix or address the problem. To assess the severity, usually you're going to want to come up with some kind of rating scale where you can categorize different problems as more or less severe. Shown here is a commonly used rating scale that ranges from one to four. A one indicates a cosmetic problem, something that doesn't have any real usability impact but might be kind of an annoyance, or kind of clunky from the user's perspective. A four is a usability catastrophe that is imperative to fix: something that would prevent users from doing the things that they really need to be able to do in order to use the system successfully. Once you've assessed the severity of all of the findings that you've come up with, you want to identify usually five to ten key findings. If you present many more than that, it dilutes the impact of the findings; with fewer than that, the report may not be as valuable to its recipient.
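Once every issue carries a severity rating, picking the key findings is a sort-and-truncate. A sketch, assuming a hypothetical consolidated issue list where each entry records its 1-4 severity and how many participants encountered it (the issue names are invented for illustration):

```python
# Hypothetical consolidated issue list: 4 = usability catastrophe.
issues = [
    {"id": "checkout-button-hidden", "severity": 4, "participants": 5},
    {"id": "confusing-date-format",  "severity": 2, "participants": 6},
    {"id": "logo-misaligned",        "severity": 1, "participants": 2},
    {"id": "search-returns-nothing", "severity": 3, "participants": 4},
]

def key_findings(issues, max_findings=10):
    """Rank issues most severe first, breaking ties by how many
    participants encountered them, and keep the top few."""
    ranked = sorted(issues,
                    key=lambda i: (i["severity"], i["participants"]),
                    reverse=True)
    return ranked[:max_findings]

for issue in key_findings(issues, max_findings=3):
    print(issue["id"], issue["severity"])
```

Frequency as the tie-breaker is one reasonable choice; you might instead weight issues that block core tasks more heavily, whatever your severity definitions call for.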
Key findings, of course, are going to be the most severe problems that you found, the ones that absolutely need to be addressed in the next round of design. These key findings are going to be the things that you highlight in detail in whatever kind of report you produce in order to communicate your findings to other stakeholders. Your less severe findings are going to be listed in an appendix. They're still valuable; they're just not as important and don't require as much work to highlight and communicate. So, for those key findings, you're going to put some extra work in to describe the problem. The goal is to describe the problem well enough that a designer or developer would know what it is that needs to be fixed. So, you want to indicate, for example, on what screen or page, or during what interaction, the problem occurs, and whether there are particular conditions under which it occurs. Often, it will be helpful to provide a screenshot or, depending on the medium, a video showing the problem occurring. For each of these key findings, you need to provide evidence. So, you need to describe the critical incidents that led you to believe that there was a problem. You're going to need to indicate if the problem led to any particular consequences, such as task failure, extended time, or users committing errors. You might indicate if you believe that the problem led to user frustration, and perhaps that frustration was linked to lower subjective scores. So, did people who encountered this problem give lower ratings on the questionnaires or give different answers in the debrief interview? If relevant, and if you have access to it, it can be very compelling to provide quotes from the think-aloud or from the debrief that indicate the severity or importance of the problem from the user's perspective. Finally, for each of these key findings, you're going to want to suggest a course of action.
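Checking whether a problem is linked to lower subjective scores can be as simple as splitting participants into those who hit the problem and those who didn't, and comparing their mean SUS scores. A sketch with invented example data; with only five to seven participants this is a descriptive comparison to include as supporting evidence, not a statistical test:

```python
from statistics import mean

# Hypothetical data: SUS score per participant, and which
# participants encountered the problem under discussion.
sus_scores = {"P1": 85.0, "P2": 52.5, "P3": 90.0, "P4": 47.5, "P5": 80.0}
hit_problem = {"P2", "P4"}

def compare_scores(sus_scores, hit_problem):
    """Mean SUS for participants who hit the problem vs. those who didn't."""
    hit = [s for p, s in sus_scores.items() if p in hit_problem]
    missed = [s for p, s in sus_scores.items() if p not in hit_problem]
    return mean(hit), mean(missed)

hit_mean, missed_mean = compare_scores(sus_scores, hit_problem)
print(f"encountered problem: {hit_mean:.1f}, did not: {missed_mean:.1f}")
```

A gap like this, paired with the critical incidents and a quote or two, makes a much more persuasive case than the incident description alone.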
One possible source for suggestions is recommending a best practice seen in other products. Have you seen examples where other products have solved this problem in a way that's elegant, or that you believe would work better for users? You might refer to a design principle or a heuristic, a general principle of good usability practice. If you can find one that applies in this situation, you can use it to suggest a way that this particular problem could be addressed. Of course, in some cases, you might just see a way that you think would work better. You might have a redesign, an idea of how that problem could be solved more elegantly. It's often best if you can back that up with a best practice or a design principle, but if you're particularly persuasive and your idea is particularly compelling, it might be a good suggestion and it might be well received. Sometimes, you don't have a concrete idea about what would fix the problem, so the best suggestion could be more research. You might suggest an additional study to dig deeper into that problem and understand better why people were experiencing it, or you might suggest an iterative design process wherein multiple solutions are prototyped and tried out to see which one would be best. So, you don't always have to suggest a particular way that the design should be improved; sometimes you can say more research is needed in order to find the best way to fix the problem. A typical, thorough user test is often going to find many more than five to ten problems. So, you'll often have a longer laundry list of maybe a couple dozen or even more problems that you've identified, most of which you're not going to focus on as key findings in your report. But you did all the work, and it would be beneficial for the recipient of your report to see what those problems are, so that they can get to them if they have time after addressing the more severe problems.
So, usually what you'll do is include them in a table or spreadsheet as an appendix to the report that you produce. In each of those cases, you're just going to describe where you found the problem, what the problem was, and your rating of its severity. Now, I've talked throughout this lecture about writing a report, creating a report, or reporting to stakeholders. Some kind of report is typically going to be the outcome of a user test. The specific format is going to be determined by the audience and purpose for that report. If you're doing a user test as part of a personal project, you probably don't need to write a detailed formal report; a prioritized list of the issues that you found is probably going to be enough to help you get through the process of fixing them in the next round of design. If you're working with a team and the team is friendly to the idea of user testing, producing a report that lists and details the key findings and the less severe issues is probably going to be enough. If you're working with a team that might be skeptical of the results that you're likely to produce, which can be the case if people are particularly committed to a design direction that you think is maybe not quite the right one, then a more formal report is probably going to be required, and you're going to have to put more effort into detailing the evidence for the problems that you found. In the case where you're reporting to an external stakeholder, and this might be a client with whom you're working, or other parts of an organization in which you're working, a formal report is going to be required. Not only are you going to have to emphasize the evidence and provide all the details about your key findings, but you're also going to have to give more weight to the method, so that people who don't know you and aren't familiar with what you did can understand how you got to the results that you got to.
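The appendix table described above (location, description, severity for each remaining problem) can be exported as a spreadsheet-friendly CSV. A minimal sketch with invented example issues:

```python
import csv
import io

# Hypothetical full issue list destined for the report's appendix.
all_issues = [
    ("Checkout page", "Submit button hidden below the fold", 4),
    ("Search results", "Empty query returns an error page", 3),
    ("Settings", "Date format is ambiguous (MM/DD vs DD/MM)", 2),
]

def appendix_csv(issues):
    """Write the appendix issue table as CSV text: one row per problem,
    with its location, description, and severity rating."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["Location", "Description", "Severity (1-4)"])
    writer.writerows(issues)
    return buf.getvalue()

print(appendix_csv(all_issues))
```

A CSV opens directly in any spreadsheet tool, which suits the appendix's purpose: a sortable backlog the team can work through after the key findings are addressed.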
So, explaining how you came up with the tasks and why you did it that way, and what the rationale was behind the questions you asked in the questionnaires and interviews, is going to be important so they can understand where the results came from. The goal of formative user testing, which is the user testing we've been focusing on in this course, is to find the biggest problems that need fixing. It's also useful to get an overall picture of how the system is performing in terms of task completion, time efficiency, error prevention, and so forth, but finding the biggest problems is your biggest task and the most important thing that you're going to want to do. To get the most out of user testing, observe the critical incidents that occur and analyze them to figure out the most severe problems. Be sure that you can communicate these problems clearly, indicating where and how they occur, providing evidence from your test sessions, and suggesting courses of action. Effective communication is key in UX research: nobody will benefit from your hard work if nobody understands what you found except for you.