Title: The Concise Encyclopedia of Applied Linguistics
Author: Carol A. Chapelle
Publisher: John Wiley & Sons Limited
Genre: Linguistics
ISBN: 9781119147374
Speaking Assessment Methods
Clark's (1979) classification of language assessment methods as indirect, semi‐direct, and direct has proven useful for understanding speaking assessment methods. O'Loughlin (2001) explains,
Indirect tests generally refer to those procedures where the test taker is not actually required to speak and belong to the “precommunicative” era in language testing. Examples of this kind of procedure are the pronunciation tests of Lado (1961) in which the candidate is asked to indicate which of a series of printed words is pronounced differently from others. (p. 4)
Indirect methods have largely given way to direct and semi‐direct methods.
Direct methods are defined as “procedures in which the examinee is asked to engage in face‐to‐face communicative exchanges with one or more human interlocutors” (Clark, 1979, p. 36), such as an interview in which participants engage in structured or semi‐structured interactions with an evaluator. Speaking assessment methods centered on interviews are collectively referred to as oral proficiency interviews (OPIs). A well‐known OPI is the American Council on the Teaching of Foreign Languages oral proficiency interview, or ACTFL OPI (ACTFL, 2009), and many locally developed OPIs are modifications of ACTFL guidelines and elicitation procedures. Common OPI structures involve a series of warm‐up questions followed by increasingly difficult questions, with examinees expected to display increasing levels of complexity in their responses. Interviewers may elicit a preselected set of responses, decide to follow up on topics or comments that the participant has introduced, or both. Examinee performance may be rated by the interviewer or by an additional rater as the interview proceeds. When an audio or video recording is made, responses can be rated after the interview is completed.
A variation of the direct method may require examinees to give a presentation on a selected topic, which often includes face‐to‐face engagement with members of an audience who pose follow‐up questions. Performance tests that require examinees to teach remain popular in international teaching assistant (ITA) contexts, but the assessment of actual teaching abilities may or may not be included in the final score (Ginther, 2003).
Direct methods have the perceived advantage of eliciting speaking skills in a manner that recreates “the setting and operation of the real‐life situations in which proficiency is normally demonstrated” (Shohamy, 1994, p. 100), and they have considerable face validity. An important qualification is one that Clark (1979) identified early on: “In the interview situation, the examinee is certainly aware that he or she is talking to a language assessor and not a waiter, taxi driver, or personal friend” (p. 38). Indeed, the fidelity of OPIs to natural conversation has been challenged by a number of researchers (Ross & Berwick, 1992; Johnson & Tyler, 1998), leading others to qualify OPIs as a specific genre of face‐to‐face interaction (He & Young, 1998); nevertheless, research on actual interaction in such tests indicates the genre does share important characteristics with natural conversation (Lazaraton, 1992, 1997).
While OPIs have traditionally been administered with a single interviewer and a single interviewee, speaking assessment of examinees in pairs, or even groups, has attracted growing attention from both researchers and language assessment practitioners (Brooks, 2009; Ducasse & Brown, 2009). The procedure is often referred to as “paired orals” or “group orals.” Such formats hold potential for increased interactivity and authenticity relative to a one‐on‐one interview; however, the added complexity makes rating more difficult. Nevertheless, paired and group oral assessments have successfully been incorporated into large‐scale assessment programs (Hasselgreen, 2005; Van Moere, 2006).
In other testing contexts, semi‐direct methods may be preferred. Semi‐direct methods do not require the presence of an interlocutor to administer the assessment. Examinees are presented with a set of prerecorded questions or tasks, typically under laboratory conditions, and responses are recorded and can be rated later. Advantages of semi‐direct methods are their potential for efficiency, time and cost savings, and high reliability. The absence of a human interlocutor may reduce construct‐irrelevant variance associated with interviewer effects.
Researchers comparing direct and semi‐direct OPI testing methods have reported strong, positive correlations (.89 to .95), leading Stansfield (1991) and Stansfield and Kenyon (1992) to argue that the methods are largely equivalent, statistically speaking. Nevertheless, qualitative analyses have revealed differences in the language produced. Semi‐direct responses tend to display greater formality and more cohesion while being accompanied by longer pauses and hesitations (Shohamy, 1994; Koike, 1998; O'Loughlin, 2001).
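The kind of statistical comparison described above can be sketched in a few lines. The example below computes a Pearson correlation between two sets of scores for the same examinees; the scores and the names `direct_scores` and `semi_direct_scores` are invented for illustration and are not data from the studies cited. It is a minimal sketch of the calculation, not a reconstruction of the analyses those authors ran.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products of deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations; a list with no variation would make these zero.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical ratings of the same six examinees under each method
# (invented for illustration only).
direct_scores = [1, 2, 2, 3, 4, 5]
semi_direct_scores = [1, 2, 3, 3, 4, 4]

r = pearson_r(direct_scores, semi_direct_scores)
print(round(r, 2))
```

A high coefficient shows only that the two methods order examinees similarly. It says nothing about whether the language produced under each method is the same, which is the point the qualitative findings make.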
Recently, psycholinguistic methods (Van Moere, 2012) and a focus on automaticity have undergone a renaissance. Elicited imitation (EI), which requires examinees to repeat sentences they have heard, is an example of a method that has enjoyed a second look (Yan, Maeda, Lv, & Ginther, 2016), in part because of its effectiveness in distinguishing proficiency levels. Ultimately, Shohamy (1994) concludes that the selection of a direct or semi‐direct method is dependent on four related concerns: accuracy (a function of reliability), utility (the assessment's relation to instruction and the difficulties associated with rater training), feasibility (ease and cost of administration), and fairness. Becker, Matsugu, and Mansoor (2017) also address the balance between practicality and construct representativeness in speaking assessments.
Rating Scales and Scale Descriptors
Assessment of speaking requires assigning numbers to the characteristics of the speech sample in a systematic fashion through the use of a scale. A scale represents the range of values associated with particular levels of performance, and scaling rules represent the relationship between the characteristic of interest and the value assigned (Crocker & Algina, 1986). The use of a scale for measurement is more intuitively clear in domains apart from language ability. For example, we can measure weight very accurately.
Measurement of a speaking performance, however, requires a different kind of scale, such as those used in certain sports competitions (see Spolsky, 1995) where the quality of performance is based on rank. There is no equal‐interval unit of measurement comparable to ounces or pounds that allows the precise measurement of a figure skating performance. Likewise, assessing speaking ranks students into ordinal categories (often referred to as vertical categories) similar to bronze, silver, and gold; A2, B2, C2; or beginning, intermediate, and advanced.
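The contrast between an equal‐interval scale and an ordinal one can be illustrated with a small sketch. The `Level` class and the ratings below are hypothetical, chosen only to show that rank comparisons are meaningful on such a scale while the numeric codes carry no information about distance between levels.

```python
from enum import IntEnum
from statistics import median_low


class Level(IntEnum):
    """Ordered proficiency bands; the integers encode rank only, not distance."""
    BEGINNING = 1
    INTERMEDIATE = 2
    ADVANCED = 3


# Hypothetical ratings for five examinees (invented for illustration only).
ratings = [
    Level.BEGINNING,
    Level.INTERMEDIATE,
    Level.INTERMEDIATE,
    Level.ADVANCED,
    Level.ADVANCED,
]

# Rank comparisons are meaningful on an ordinal scale.
higher = Level.ADVANCED > Level.INTERMEDIATE > Level.BEGINNING

# median_low always returns one of the observed ratings, so the summary
# stays on the scale; an arithmetic mean (2.2 here) would not name a level.
typical = Level(median_low(ratings))
print(typical.name)  # INTERMEDIATE
```

Using the median rather than the mean is a design choice that follows from the ordinal nature of the scale: nothing guarantees that the step from beginning to intermediate is the same size as the step from intermediate to advanced.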
The global assessment of performance is associated with holistic scales, where abilities are represented by level descriptors composed of a qualitative summary of the raters' observations. Benchmark performances are selected to exemplify the levels and their descriptors. Scale descriptors are typically associated with, but not limited to: pronunciation (focusing on segmentals); phonological control (focusing on suprasegmentals); grammar/accuracy (morphology, syntax, and usage); fluency (speed and pausing); vocabulary (range and idiomaticity); coherence; and organization. If the assessment involves evaluation of interaction, the following may also be included: turn‐taking strategies, cooperative strategies, and asking for or providing clarification.
Holistic