(Kurita) Hello, everyone. (Student) Hello. (Kurita) Welcome to “Interactive Teaching” WEEK 6. This week’s topic is “Evaluations to Promote Learning.” In this session, we are going to learn about how to design evaluations. Before starting on the main topic, let me clarify the goal for this week: “Acquire the basic knowledge to evaluate students’ learning, comprehend the significance of evaluations, and be able to make use of them.” We will consider the third objective today: “Be able to explain significant points of evaluations and be able to make use of them.” Here is the table of contents: evaluation methods, evaluating evaluations, and the wrap-up. Let us begin with evaluation methods. What are the methods for evaluation? You must have gone through various kinds of evaluation as students, so could you share some examples? Let me ask some of you, starting with Hodrigo-san. (Student) There were many tests. (Kurita) What kind of tests? (Student) I mean written examinations. (Kurita) I see. How about you? (Student) I had many tests, too, and also interviews. (Kurita) Evaluation by interviews? (Student) Yes. (Kurita) I see. (Student) I had writing assignments. (Kurita) Evaluation by writing assignments. (Student) I had a task to design some architecture. (Kurita) Design? (Student) I drew a plan and made a model. (Kurita) You made a model. Thank you. You gave us so many examples of evaluation methods. Here is a simple classification of evaluation methods. The vertical axis shows the complexity of the method; the higher it goes, the simpler it becomes. The horizontal axis shows the type of method; the left side shows writing and the right side shows demonstrating. So-called tests, multiple-choice tests and objective examinations are categorized in this section (i.e., “simple” and “writing”). Examinations using a descriptive format are categorized in this section (i.e., “intermediate” and “writing”). 
Essay writing and writing assignments are categorized in performance-type tasks (i.e., “complex” writing and demonstrating tasks). The sections on the right are for demonstration-type methods. Observation, interviews (as the student mentioned), practical skills tests, and presentations (which are included in the performance-type tasks) are classified in demonstration-type methods. They can be roughly categorized this way. Furthermore, the combination of different methods can be called a portfolio evaluation. When you speak of a portfolio evaluation, a portfolio is not a special kind of material, but is rather the combination of several evaluation methods. These days, the trend has shifted from simple to complex methods; the view is that students should be evaluated based on whether they are able to use their knowledge and skills in real settings. However, I would also like to let you know that such evaluations are very difficult. Let’s move on to evaluating evaluations. There are four perspectives regarding evaluating evaluations. There are actually more than four, but these four are the basic points I would like you to cover. They are reliability, validity, objectivity, and efficiency. You could use these four points for judging evaluation methods. Let me explain them one by one. Firstly, reliability refers to the reproducibility of results or the accuracy of tests. Rather like the pinpoint accuracy of a ruler, it shows to what degree you could obtain the same results no matter how many times you conduct the same examination with the same group. So, it has to do with whether a student could obtain the same marks every time he or she takes the same kind of examination. This is reliability, one of the perspectives for evaluating evaluations. Secondly, validity is the appropriateness of the evaluation method. It refers to whether the evaluation method you have adopted can really measure the skills and behaviors you are focusing on. 
This is the most important perspective for measuring skills, and is also difficult to accomplish. When you measure English skills, for example, does the examination really measure those skills? Doesn’t it actually measure tolerance or the skill of writing quickly instead of English skills? Even if the method is as reliable as a ruler, isn’t it used for measuring something completely different? Validity has to do with whether you are able to measure what you want to see. The third perspective is objectivity. It is a bit of a technical term, but it refers to how results correspond between the different people who conduct evaluations. It refers to whether the results remain the same even if the grader changes. Subjectivity is the antonym of objectivity. Decisions differ among people because of subjectivity. As Horiuchi-san mentioned regarding interviews, if you pass or fail depending on the interviewers, or if only one of the interviewers says no and fails the applicant, you could say that the method lacks objectivity. The final perspective is efficiency. It refers to the practicality of the method in terms of time and economy. It is not essential for an evaluation itself, but it is an important point to consider since it has to do with the feasibility of the evaluation method. Is it easy to conduct and give marks? If you chose to examine someone’s life all day or all week long to see if he or she has the required competencies of a counselor, you might be able to have a deep understanding of his or her personality, but that kind of method is too impractical, so you need to devise a slightly more efficient method. Efficiency has to do with this kind of usability. Now we have covered the four perspectives for considering evaluation methods: reliability, validity, objectivity, and efficiency. When you deliver a class and check what the students have learned in your class, you need to choose appropriate evaluation methods by using these four perspectives. 
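Two of the four perspectives described here, reliability and objectivity, can be put into rough numbers. The following is a minimal sketch and was not part of the session. It assumes reliability is approximated by the correlation between two sittings of the same test, and objectivity by Cohen's kappa between two graders. These are common choices, but they are assumptions here. The function names and all scores are hypothetical.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def cohen_kappa(rater_a, rater_b):
    """Agreement between two graders, corrected for chance agreement."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    labels = set(rater_a) | set(rater_b)
    expected = sum(
        (rater_a.count(label) / n) * (rater_b.count(label) / n)
        for label in labels
    )
    if expected == 1:  # both graders always give the same single label
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical data: the same five students sit the same test twice.
first_sitting = [62, 75, 48, 90, 81]
second_sitting = [65, 73, 50, 88, 84]

# Hypothetical data: two interviewers judge the same six applicants.
interviewer_a = ["pass", "pass", "fail", "pass", "fail", "fail"]
interviewer_b = ["pass", "fail", "fail", "pass", "fail", "pass"]

reliability = pearson(first_sitting, second_sitting)
objectivity = cohen_kappa(interviewer_a, interviewer_b)
print(f"test-retest correlation: {reliability:.2f}")
print(f"inter-grader kappa: {objectivity:.2f}")
```

With these made-up numbers the two sittings correlate strongly (about 0.99), while the two interviewers agree only modestly (kappa about 0.33). That matches the pattern described above, where results change depending on the interviewer. Neither number says anything about validity, which cannot be read off scores alone.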
Think about the National Center Test; I think most university students are familiar with it. Think about the multiple-choice test from the four perspectives of evaluation; you can put aside the English listening test for now. Think about it for a while. Are you all done? OK, what do you think, Nakamura-san? (Student) Since it is a multiple-choice test, I think the objectivity in giving marks is guaranteed. Also, for the same reason, awarding marks would be done in an efficient way. Reflecting on my own experience as a student preparing for an entrance examination, I scored almost the same when taking tests within a short period of time, so I think it is reliable in that sense. With regard to validity, I think that the National Center Test is aimed at measuring academic achievement, but does it really measure the academic achievement required for graduating from high school? This seems to be quite a controversial issue. (Kurita) I see. Thank you. In general, three of the four perspectives, namely, reliability, objectivity, and efficiency, can often be accomplished at the same time. There is a sort of trade-off relationship between validity and these three perspectives. The more time and effort you put into devising an evaluation method, the better it will become, but the demands on time and the complexity of the method lead to less reliability, objectivity, and efficiency. Speaking of the aforementioned interview method, even if you could improve its validity, it would not mean that everyone could use the method. The scales used for the method might be so wide in range that it might be less efficient and demand more time and effort. In general, the more valid a method becomes, the less reliable, objective, and efficient it is, and the more reliable, objective, and efficient it becomes, the less valid it is. Regarding the National Center Test, it might be a measure of tolerance. 
Not only the National Center Test, but examinations in general often end up measuring skills different from those targeted. Therefore, from these perspectives, I would like you to review once more the evaluation methods you have experienced or the ones you are planning to incorporate into your course design. Now, let’s wrap up. There is a wide variety of evaluation methods, as I showed you on the graph. You should use different evaluation methods to suit the target and purpose. The perspectives for evaluating the evaluation methods are reliability, validity, objectivity, and efficiency. Choose the appropriate perspectives from these four, and then choose appropriate evaluation methods. That’s all for this session.