This assignment is submitted to fulfill the requirements of the Language Testing subject
Group Members
Cucun Cuniasih 092122109
Dian Budiana 092122101
Enda Hidayat 092122119
Fitri Suciyanti 092122114
Melia Kurniasari 092122094
FACULTY OF EDUCATIONAL AND TEACHER’S TRAINING
BEYOND TESTS: ALTERNATIVES IN ASSESSMENT
In the public eye, tests have acquired an aura of infallibility in our culture of mass producing everything, including the education of school children. Everyone wants a test for everything, especially if the test is cheap, quickly administered, and scored instantaneously. A more balanced viewpoint is offered by Bailey (1998, p. 204): “One of the disturbing things about tests is the extent to which many people accept the results uncritically, while others believe that all testing is invidious. But tests are simply measurement tools: It is the use to which we put their results that can be appropriate or inappropriate.”
It is clear by now that tests are one of a number of possible types of assessment. Tests are formal procedures, usually administered within strict time limitations, to sample the performance of a test-taker in a specified domain. Assessment connotes a much broader concept in that most of the time when teachers are teaching, they are also assessing. Assessment includes all occasions from informal impromptu observations and comments up to and including tests.
Early in the decade of the 1990s, in a culture of rebellion against the notion that all people and all skills could be measured by traditional tests, a novel concept emerged that began to be labeled “alternative” assessment. As teachers and students were becoming aware of the shortcomings of standardized tests, “an alternative to standardized testing and all the problems found with such testing” (Huerta-Macias, 1995, p. 8) was proposed. That proposal was to assemble additional measures of students (portfolios, journals, observations, self-assessments, peer-assessments, and the like) in an effort to triangulate data about students. For some, such alternatives held “ethical potential” (Lynch, 2001, p. 228) in their promotion of fairness and the balance of power relationships in the classroom.
The defining characteristics of the various alternatives in assessment that have been commonly used across the profession were aptly summed up by Brown and Hudson (1998, pp. 654-655). Alternatives in assessment
1. Require students to perform, create, produce, or do something;
2. Use real-world contexts or simulations;
3. Are nonintrusive in that they extend the day-to-day classroom activities;
4. Allow students to be assessed on what they normally do in class every day;
5. Use tasks that represent meaningful instructional activities;
6. Focus on processes as well as products;
7. Tap into higher-level thinking and problem-solving skills;
8. Provide information about both the strengths and weaknesses of students;
9. Are multiculturally sensitive when properly administered;
10. Ensure that people, not machines, do the scoring, using human judgment;
11. Encourage open disclosure of standards and rating criteria; and
12. Call upon teachers to perform new instructional and assessment roles.
THE DILEMMA OF MAXIMIZING BOTH PRACTICALITY AND WASHBACK
The principal purpose of this chapter is to examine some of the alternatives in assessment that are markedly different from formal tests. Tests, especially large-scale standardized tests, tend to be one-shot performances that are timed, multiple-choice, decontextualized, norm-referenced, and that foster extrinsic motivation. On the other hand, tasks like portfolios, journals, and self-assessments are:
· Open-ended in their time orientation and format,
· Contextualized to a curriculum,
· Referenced to the criteria (objectives) of that curriculum, and
· Likely to build intrinsic motivation.
Alternative techniques demand considerable time from teachers, and even more time must be spent if the teacher hopes to offer a reliable evaluation within students across time, as well as across students (taking care not to favor one student or group of students). But the alternative techniques also offer markedly greater washback, are superior formative measures, and, because of their authenticity, usually carry greater face validity.
Notice the implied negative correlation: as a technique increases in its washback and authenticity, its practicality and reliability tend to be lower. Conversely, the greater the practicality and reliability, the less likely you are to achieve beneficial washback and authenticity.
The figure appears to imply the inevitability of the relationship: large-scale multiple-choice tests cannot offer much washback or authenticity, nor can portfolios and such alternatives achieve much practicality or reliability. Surely we should not sit idly by, accepting the presumably inescapable conclusion that all standardized tests will be devoid of washback and authenticity. A number of approaches to building washback and authenticity into tests are possible, many of which have already been implicitly presented in this book:
· Building as much authenticity as possible into multiple-choice task types and items.
· Designing classroom tests that have both objective scoring sections and open-ended response sections, varying the performance tasks.
· Turning multiple-choice test results into diagnostic feedback on areas of needed improvement.
· Maximizing the preparation period before a test to elicit performance relevant to the ultimate criteria of the test.
· Teaching test-taking strategies.
· Helping students to see beyond the test: don’t “teach to the test.”
· Triangulating information on a student before making a final assessment of competence.
As we look at alternatives in assessment in this chapter, we must remember Brown and Hudson’s (1998) admonition to scrutinize the practicality, reliability, and validity of those alternatives at the same time that we celebrate their face validity, washback potential, and authenticity. Assessments proposed to serve as triangulating measures of competence imply a responsibility to be rigorous in determining objectives, response modes, and criteria for evaluation and interpretation.
PERFORMANCE BASED ASSESSMENT
There has been a great deal of press in recent years about performance-based assessment, sometimes merely called performance assessment (Shohamy, 1995; Norris et al., 1998).
The push toward more performance-based assessment is part of the same general educational reform movement that has raised strong objections to using standardized test scores as the only measures of student competencies (see, for example, Valdez Pierce & O’Malley, 1992; Shepard & Bliem, 1993).
Performance-based assessment implies productive, observable skills, such as speaking and writing, of content-valid tasks. Such performance usually, but not always, brings with it an air of authenticity: real-world tasks that students have had time to develop.
O’Malley and Valdez Pierce (1996) considered performance-based assessment to be a subset of authentic assessment. In other words, not all authentic assessment is performance-based. One could infer that reading, listening, and thinking have many authentic manifestations, but since they are not directly observable in and of themselves, they are not performance-based. According to O’Malley and Valdez Pierce (p. 5), the following are characteristics of performance assessment:
1. Students make a constructed response
2. They engage in higher-order thinking, with open-ended tasks
3. Tasks are meaningful, engaging, and authentic
4. Tasks call for the integration of language skills
5. Both process and product are assessed
6. Depth of a student’s mastery is emphasized over breadth
Performance-based assessment needs to be approached with caution. It is tempting for teachers to assume that if a student is doing something, then the process has fulfilled its own goal and the evaluator needs only to make a mark in the grade book that says “accomplished” next to a particular competency. In reality, performances as assessment procedures need to be treated with the same rigor as traditional tests. This implies that teachers should
· State the overall goal of the performance,
· Specify the objectives (criteria) of the performance in detail,
· Prepare students for performance in stepwise progressions,
· Use a reliable evaluation form, checklist, or rating sheet,
· Treat performances as opportunities for giving feedback and provide that feedback systematically, and
· If possible, utilize self- and peer-assessments judiciously.
To sum up, performance assessment is not completely synonymous with the concept of alternative assessment. Rather, it is best understood as one of the primary traits of the many available alternatives to assessment.
PORTFOLIOS
According to Genesee and Upshur (1996), a portfolio is “a purposeful collection of students’ work that demonstrates ... their efforts, progress, and achievements in given areas” (p. 99). Portfolios include materials such as:
· Essays and compositions in draft and final forms;
· Reports, project outlines;
· Poetry and creative prose;
· Artwork, photos, newspaper or magazine clippings;
· Audio and/or video recordings of presentations, demonstrations, etc.;
· Tests, test scores, and written homework exercises;
· Notes on lectures; and
· Self- and peer-assessments: comments, evaluations, and checklists.
Gottlieb (1995) suggested a developmental scheme for considering the nature and purpose of portfolios, using the acronym CRADLE to designate six possible attributes of a portfolio: Collecting, Reflecting, Assessing, Documenting, Linking, and Evaluating.
The advantages of engaging students in portfolio development have been extolled in a number of sources (Genesee & Upshur, 1996; O’Malley & Valdez Pierce, 1996; Brown & Hudson, 1998; Weigle, 2002). A synthesis of those characteristics gives us a number of potential benefits. Portfolios
· Foster intrinsic motivation, responsibility, and ownership.
· Promote student-teacher interaction with the teacher as facilitator.
· Individualize learning and celebrate the uniqueness of each student.
· Provide tangible evidence of a student’s work.
· Facilitate critical thinking, self-assessment, and revision processes.
· Offer opportunities for collaborative work with peers.
· Permit assessment of multiple dimensions of language learning.
Successful portfolio development will depend on following a number of steps and guidelines.
1. State objectives clearly. Pick one or more of the CRADLE attributes named above and specify them as objectives of developing a portfolio.
2. Give guidelines on what materials to include. Once the objectives have been determined, name the types of work that should be included.
3. Communicate assessment criteria to students. This is both the most important aspect of portfolio development and the most complex.
Portfolio self-assessment questions (O’Malley & Valdez Pierce, 1996, p. 42)
1. Look at your writing sample.
a) What does the sample show that you can do?
b) Write about what you did well.
2. Think about realistic goals. Write one thing you need to do better. Be specific.
Genesee and Upshur (1996) recommended using a questionnaire format for self-assessment, with questions like the following for a project:
Portfolio self-assessment questionnaire
1. What makes this a good or interesting project?
2. What is the most interesting part of the project?
3. What was the most difficult part of the project?
4. What did you learn from the project?
5. What skills did you practice when doing this project?
6. What resources did you use to complete this project?
7. What is the best part of the project? Why?
8. How would you make the project better?
4. Designate time within the curriculum for portfolio development.
5. Establish periodic schedules for review and conferencing.
6. Designate an accessible place to keep portfolios. It is inconvenient for students to carry collections of papers and artwork.
JOURNALS
Fifty years ago, journals had no place in the second language classroom. A journal is a log (or “account”) of one’s thoughts, feelings, reactions, assessments, ideas, or progress toward goals, usually written with little attention to structure, form, or correctness. Models of journal use in educational practice have sought to tighten up this style of journal in order to give it some focus (Staton et al., 1987). The result is the emergence of a number of overlapping categories or purposes in journal writing, such as the following:
· Language learning logs
· Grammar journals
· Responses to readings
· Strategies-based learning logs
· Self-assessment reflections
· Diaries of attitudes, feelings, and other affective factors
· Acculturation logs
Most classroom-oriented journals are what have now come to be known as dialogue journals. Through dialogue journals, teachers can become better acquainted with their students, in terms of both their learning progress and their affective states, and thus become better equipped to meet students’ individual needs.
With the widespread availability of internet communications, journals and other student-teacher dialogues have taken on a new dimension. Journals obviously serve important pedagogical purposes: practice in the mechanics of writing, using writing as a “thinking” process, individualization, and communication with the teacher.
It is important to turn the advantages and potential drawbacks of journals into positive general steps and guidelines for using journals as assessment instruments.
1. Sensitively introduce students to the concept of journal writing. For many students, especially those from educational systems that play down the notion of teacher-student dialogue and collaboration, journal writing may be difficult at first.
2. State the objective(s) of the journal.
3. Give guidelines on what kinds of topics to include.
4. Carefully specify the criteria for assessing or grading journals.
5. Provide optimal feedback in your responses.
6. Designate appropriate time frames and schedules for review.
7. Provide formative, washback-giving final comments.
CONFERENCES AND INTERVIEWS
Reference was made to conferencing as a standard part of the process approach to teaching writing, in which the teacher, in a conversation about a draft, facilitates the improvement of the written work. Such conferencing has the advantage of one-on-one interaction between teacher and student and of the teacher’s being able to direct feedback toward a student’s specific needs.
Conferences are not limited to drafts of written work. Including portfolios and journals discussed above, the list of possible functions and subject matter for conferencing is substantial:
· commenting on drafts of essays and reports
· reviewing portfolios
· responding to journals
· advising on a student’s plan for an oral presentation
· assessing a proposal for a project
· giving feedback on the results of performance on a test
· clarifying understanding of a reading
· exploring strategies-based options for enhancement or compensation
· focusing on aspects of oral production
· checking a student’s self-assessment of a performance
· setting personal goals for the near future
· assessing general progress in a course
Conferences must assume that the teacher plays the role of facilitator and guide, not of an administrator of a formal assessment. So that the student will be as candid as possible in self-assessing, the teacher should not consider a conference as something to be scored or graded. Conferences are by nature formative, not summative, and their primary purpose is to offer positive washback.
Discussions of alternatives in assessment usually encompass one specialized kind of conference: an interview. This term is intended to denote a context in which a teacher interviews a student for a designated assessment purpose. (We are not talking about a student conducting an interview of others in order to gather information on a topic.) Interviews may have one or more of several possible goals, in which the teacher
· assesses the student’s oral production,
· ascertains a student’s needs before designing a course or curriculum,
· seeks to discover a student’s learning styles and preferences,
· asks a student to assess his or her own performance, and
· requests an evaluation of a course.
One overriding principle of effective interviewing centers on the nature of the questions that will be asked. It is easy for teachers to assume that interviews are just informal conversations and that they need little or no preparation. To maintain the all-important reliability factor, interview questions should be constructed carefully to elicit as focused a response as possible.
Because interviews have multiple objectives, as noted above, it is difficult to generalize principles for conducting them, but the following guidelines may help to frame the questions efficiently:
1. Offer an initial atmosphere of warmth and anxiety-lowering (warm-up).
2. Begin with relatively simple questions.
3. Continue with level-check and probe questions, but adapt to the interviewee as needed.
4. Frame questions simply and directly.
5. Focus on only one factor for each question. Do not combine several objectives in the same question.
6. Be prepared to repeat or reframe questions that are not understood.
7. Wind down with friendly and reassuring closing comments.
How do conferences and interviews score in terms of principles of assessment? Their practicality, as is true for many of the alternatives to assessment, is low because they are time-consuming. Reliability will vary between conferences and interviews. In the case of conferences, it may not be important to have rater reliability because the whole purpose is to offer individualized attention, which will vary greatly from student to student. For interviews, a relatively high level of reliability should be maintained with careful attention to objectives and procedures. Face validity for both can be maintained at a high level due to their individualized nature. As long as the subject matter of the conference or interview is clearly focused on the course and course objectives, content validity should also be upheld. Washback potential and authenticity are high for conferences, but possibly only moderate for interviews unless the results of the interview are clearly folded into subsequent learning.