
ASSESSING SPEAKING

Emma Belfort

Introduction
This presentation focuses on the assessment of
oral skills and is based on H. Douglas Brown's
treatment of the subject as detailed in his book,
Language Assessment: Principles and Classroom
Practices, published in 2004 by Pearson
Longman.

First challenge:
speaking vs. other skills
Listening and speaking almost always correlated
Only in very limited contexts (monologues, speeches,
story-telling, and reading aloud) can oral language be
assessed without the aural participation of an
interlocutor.
Observations invariably tainted by other skills
Speaking is almost always colored by the accuracy and
effectiveness of test-takers' reading comprehension or
listening

Second challenge:
design and elicitation techniques
Most speaking is the product of creative construction of linguistic strings: the
speaker makes choices of lexicon, structure, and discourse. As tasks become more
and more open-ended, the freedom of choice given to test-takers creates a
challenge in scoring procedures; therefore:
The stimulus used to elicit the target response for a particular category
must be designed in a way that prevents test-takers from avoiding or
paraphrasing and thereby dodging production of the target form.
In receptive performance: the elicitation stimulus can be structured to
anticipate predetermined responses and only those responses.
In productive performance: the oral or written stimulus must be specific
enough to elicit output within an expected range of performance such that
scoring procedures apply appropriately.

Taxonomy for oral production


Imitative
Intensive
Responsive
Interactive
Extensive (monologue)

Micro- and macroskills of speaking


Microskills: smaller chunks of language such
as phonemes, morphemes, words,
collocations, and phrasal units.
Macroskills: larger elements such as fluency,
discourse, function, style, cohesion, nonverbal
communication, and strategic options.

Microskills of oral production


These skills comprise 11 different objectives for assessing
speaking:
1. Differences among English phonemes and allophonic variants
2. Chunks of language of different lengths
3. English stress patterns, words in stressed and unstressed positions,
rhythmic structure, and intonation contours
4. Reduced forms of words and phrases
5. Lexical units (words) to accomplish pragmatic purposes
6. Fluent speech at different rates of delivery

Microskills of oral production (ctd.)


7. Monitor one's own oral production and use various strategic
devices (pauses, fillers, self-corrections, backtracking) to
enhance the clarity of the message.
8. Use grammatical word classes (nouns, verbs, etc.), systems
(e.g., tense, agreement, pluralization), word order, patterns,
rules, and elliptical forms.
9. Produce speech in natural constituents: in appropriate phrases,
pause groups, breath groups, and sentence constituents.
10. Express a particular meaning in different grammatical forms.
11. Use cohesive devices in spoken discourse.

Macroskills of oral production


These skills comprise 5 further objectives for assessing speaking:
12. Communicative functions according to situations, participants, and goals.
13. Sociolinguistic features used in face-to-face conversations: styles, registers,
implicature, redundancies, pragmatic conventions, conversation rules,
floor-keeping and yielding, and interrupting.
14. Links between events and communicative relations such as focal and peripheral
ideas, events and feelings, new information and given information, and
generalization and exemplification.
15. Facial features, kinesics, body language, and other nonverbal cues.
16. Speaking strategies: emphasizing key words, rephrasing, providing a
context for interpreting the meaning of words, appealing for help, and
accurately assessing how well your interlocutor is understanding you.

Most common techniques and related tasks

As we review these techniques, three important issues must be
considered when designing tasks:
No speaking task is capable of isolating the skill of oral
production.
The designer must make sure the elicitation prompt achieves its
aims as closely as possible.
Carefully specify scoring procedures so as to achieve as high
a reliability index as possible.

Assessing Imitative speech


Task: to repeat the stimulus, which can be a
pair of words, a sentence, or a question (to
test for intonation production), with items
focusing on a specific phonological criterion. A
variation involves prompting test-takers with a
brief written stimulus to read aloud.
Drawbacks: there is a potential negative
washback effect; also, this task type should not
occupy a dominant role in an overall oral
production assessment.

Example: Word repetition task

Scoring scale (a rough numeric sketch follows below):
Acceptable pronunciation
Comprehensible, partially correct pronunciation
Silence, seriously incorrect pronunciation
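
To illustrate how such a three-tier scale might be turned into numbers for record-keeping, here is a minimal sketch assuming point values of 2, 1, and 0 for the three tiers; the numeric values, labels, and function names are illustrative assumptions, not a scheme prescribed in Brown (2004).

```python
# Hypothetical sketch: applying a three-tier word-repetition scale.
# The tier labels mirror the slide above; the 2/1/0 point values are assumed for illustration.

from typing import List

SCALE = {
    "acceptable": 2,               # acceptable pronunciation
    "partially_correct": 1,        # comprehensible, partially correct pronunciation
    "incorrect_or_silent": 0,      # silence or seriously incorrect pronunciation
}

def score_repetition_task(ratings: List[str]) -> dict:
    """Total and average the ratings assigned to each repeated item."""
    points = [SCALE[r] for r in ratings]
    return {
        "total": sum(points),
        "max_possible": 2 * len(points),
        "average": sum(points) / len(points) if points else 0.0,
    }

# Example: five repeated words or sentences rated by the examiner.
print(score_repetition_task(
    ["acceptable", "partially_correct", "acceptable", "incorrect_or_silent", "acceptable"]
))
```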

Assessing intensive speaking


Test-takers are prompted to produce short stretches of discourse (no more
than a sentence) through which they demonstrate linguistic ability at a
specified level of language. Many tasks are cued in that they lead the
test-taker into a narrow band of possibilities. Some of the techniques
include:
Directed response tasks
Read-aloud tasks
Sentence/dialogue completion tasks and oral questionnaires
Picture-cued tasks
Translation (of limited stretches of discourse)

Assessing intensive speaking:


Directed response tasks
These are mechanical, non-communicative tasks, but
they do require minimal processing of meaning in order
to produce the correct grammatical output.

Example:

Assessing intensive speaking:


Read-aloud tasks
These tasks include reading beyond the sentence level up to
a paragraph or two.
Advantages: predictable output, practicality, reliability in
scoring
Disadvantages: somewhat inauthentic in that we seldom
read aloud in real life; also, this skill calls for certain
specialized oral abilities which may not be reliable
indicators of test-takers' pragmatic ability to
communicate orally in face-to-face contexts.

Example: Read-aloud task


Prator's (1972) Manual of American English Pronunciation diagnostic
passage: test-takers read aloud into a recorder; scoring was based on a number of
phonological factors (vowels, diphthongs, consonants, consonant clusters,
stress, and intonation), with a two-page diagnostic checklist on which all errors
and questionable items were noted and a four-point scale for pronunciation
and for fluency.

Test of Spoken English scoring scale (1987)

Assessing intensive speaking:


Sentence/dialogue completion tasks
These include reading a dialogue in which one speaker's lines have been omitted. Other
examples are form filling (Underhill, 1987) or oral questionnaires. While
individual variations in responses are accepted, this technique taps into a
learner's ability to discern expectancies in a conversation and to produce
sociolinguistically correct language.
It could be contended that performance on these items is responsive rather than
intensive, but notice there is a degree of control which predisposes the test-taker
to respond with certain expected forms. In any event, according to Brown (2004), this
argument underscores the fine lines of distinction between the five categories
for assessing spoken language.
Advantages: moderate control of output; the written format allows a little more
time for the test-taker to anticipate the answer and removes the potential
ambiguity created by aural misunderstanding.
Disadvantages: the contrived, inauthentic nature of this task and the fact that it
relies on literacy and an ability to transfer easily from written to spoken English.

Example: Dialogue completion task

Example: Sentence completion task

Assessing intensive speaking:


Picture-cued tasks
Designed to elicit a word or phrase. Pictures may be very simple, somewhat more elaborate and
busy, or composed of a series that tells a story or incident.
Other cues: maps can be used to give instructions and directions and to specify locations.
Other techniques: pairing two test-takers supplied with identical sets of numbered pictures. One
test-taker is cued to describe one of the pictures in as few words as possible for the other test-taker
to identify.
Advantages: help to unlock the almost ubiquitous link between listening and speaking performance
and remove the potential ambiguity created by aural misunderstanding.
Disadvantages: the inauthentic nature of this task and the fact that it relies on literacy and an
ability to transfer easily from written instructions to spoken English.
Also, although this technique is quite versatile, it can be heavily dependent on very clear written
instructions.
Scoring: may be problematic depending on the expected performance

Assessing intensive speaking:


Picture-cued tasks (ctd.)
Picture-based tasks are very popular for eliciting oral language performance and can be
used not only for intensive production but also for extensive output. When scoring
multiple factors, recordings of the test-takers' productions are very useful to the grader.
Types of language that can be elicited using pictures:
minimal pairs
comparatives
verb tenses
nouns, negative responses, numbers, and locations
giving directions and instructions
elaborate responses and descriptions

Example: Picture-cued elicitation task

minimal pairs
comparatives
These cues are intuitive: they do not rely on written instructions.

Example: Picture-cued elicitation task


Brown & Sahni, 1994

These pictures need clear written instructions, as they could be misleading and confusing without them.

Example: Picture-cued elicitation task

Brown & Sahni, 1994

Assessing intensive speaking:


Translation (of limited stretches of discourse)
According to Brown, translation methods are certainly
passé in today's communicative classroom, but he
concedes that in countries (such as Venezuela) where
English is still not a prevailing language, translation serves as a
meaningful communicative device for the English learner.

This technique involves giving test-takers a native-language
word, phrase, or sentence and asking them to
translate it into the English equivalent.
Advantages: control of the output, which of course means
that scoring is more easily accomplished.

Assessing responsive speaking:


Responsive speaking differs from intensive tasks in the increased creativity
given to the test-taker, and from interactive tasks in the
somewhat limited length of utterances.
It involves brief interactions with an interlocutor.
Some of the techniques commonly used include:
Question and Answer
Giving instructions and directions
Paraphrasing

Assessing responsive speaking:


Questions and Answers
These tasks can consist of simple and complex questions from an interviewer, or they
can make up a portion of a whole battery of questions and prompts in an oral
interview. There are two types of questions:
Display questions: this type of question is intensive in its purpose (as we
have seen previously, these questions are designed to elicit a predetermined correct
response).
Referential questions: through these questions the test-taker is given the opportunity
to produce meaningful language in response.

In designing referential questions it is important to keep in mind why the question is
being asked: is it to elicit a string of language output, or is it to gain a sense of the
test-taker's discourse competence?
Oral interaction with a test administrator often involves the latter asking all the questions.
An alternative to this format is to elicit questions from the test-taker.
One technique involves more than one test-taker with an interviewer. With two
students in an interview context, both test-takers can ask questions of each other.
This technique might meet practicality requirements, but it might be troublesome to
score.

Example: Question and answer task


Questions eliciting open-ended responses

Elicitation of questions from the test-taker

Assessing responsive speaking:


Giving instructions and directions
This technique is simple: the administrator poses the problem and the
test-taker responds.
The task should require the test-taker to produce at least 5 or 6 sentences.
Topics need to be familiar (not beyond the content schemata of the
test-taker) so that an impromptu delivery is attainable; this avoids
having to supply the problem in advance, which in turn guarantees that the
test-taker does not parrot back a memorized set of sentences.
Advantages: Using this type of stimulus provides an opportunity for
the test-taker to engage in a relatively extended stretch of discourse,
to be very clear and specific, and to use appropriate discourse markers
and connectors.
Scoring: based primarily on comprehensibility and secondarily on
other specified grammatical or discourse categories.

Example: Giving instructions and directions task

These tasks can be designed to be simple or complex, potentially placing them in
the category of extensive speaking. Objectives must be clearly set: if the
purpose is to elicit a short and simple response, directives must be clear so as
not to take the test-taker down a path of complexity for which she or he is not
prepared.

Assessing responsive speaking:


Paraphrasing
These tasks require the test-taker to read or hear a limited number of
sentences and produce a paraphrased version of the discourse.
It is important to pinpoint the objective of the task clearly. In these
tasks the integration of listening and speaking is probably more at
stake than simple oral production alone.
Advantages: elicit short stretches of output and perhaps tap into
test-takers' ability to practice the conversational art of conciseness
by reducing the output/input ratio.
Some of the contexts that may be assessed include: describing,
comparing and contrasting, narrating, summarizing, giving an
opinion, supporting an opinion, hypothesizing, defining, and functioning
interactively.

Assessing interactive speech:


Interactive tasks include long stretches of interactive discourse. The difference between
these types of oral production assessment and responsive speech is the
length and complexity of the expected output. Interaction can take two forms:
Transactional language: to exchange specific information
Interpersonal exchanges: to maintain social relationships
Some of the techniques commonly used include:
Interviews
Role plays
Discussions
Games

Assessing interactive speech:


Interview
This technique involves a test administrator and a test-taker sitting down in a direct
face-to-face exchange and proceeding through a protocol of questions and directives.
Interviews can vary in length, depending on their purpose:
Placement interview: designed to get a quick spoken sample from a student in order
to verify placement into a course; may need only five minutes if the interviewer is
trained to evaluate the output accurately.
Comprehensive interview: designed to cover predetermined oral production
contexts and may require the better part of an hour.
A variation is to place two test-takers in one interview. The advantages of this
technique are the opportunity for student-student interaction, which increases
authenticity, and the practicality of scheduling twice as many candidates. The
disadvantages are equalizing the output between the two test-takers, discerning the
interaction effect in the case of unequal comprehension and production ability, and scoring
two people simultaneously.
Scoring: based on a set of parameters which may include accuracy in pronunciation,
grammar, vocabulary usage, fluency, sociolinguistic/pragmatic appropriateness, task
accomplishment, and even comprehension. Scoring can be facilitated by recording the
interview.
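
To make the idea of a parameter-based scoring system concrete, here is a minimal, hypothetical sketch of a weighted rubric aggregator; the category names, weights, and 0-5 rating scale are assumptions for illustration only and are not a scale prescribed in Brown (2004).

```python
# Minimal sketch: combining interview rubric ratings into one overall score.
# Categories follow the slide's list of parameters; weights and the 0-5 scale are illustrative.

from typing import Dict

WEIGHTS = {
    "pronunciation": 0.15,
    "grammar": 0.20,
    "vocabulary": 0.20,
    "fluency": 0.15,
    "pragmatic_appropriateness": 0.15,
    "task_accomplishment": 0.10,
    "comprehension": 0.05,
}

def weighted_interview_score(ratings: Dict[str, float]) -> float:
    """Weighted average of 0-5 ratings; missing categories count as 0."""
    assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9, "weights should sum to 1"
    return sum(WEIGHTS[cat] * ratings.get(cat, 0.0) for cat in WEIGHTS)

# Example: ratings taken from a recorded interview; averaging two raters'
# category scores before aggregation can help with reliability.
ratings = {"pronunciation": 4, "grammar": 3, "vocabulary": 4, "fluency": 3,
           "pragmatic_appropriateness": 4, "task_accomplishment": 5, "comprehension": 4}
print(round(weighted_interview_score(ratings), 2))  # overall score on the 0-5 scale
```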

Assessing interactive speech:


Interview (ctd.)
Disadvantages: the interview is open-ended and involves a significant level of interaction
in which the interviewer is forced to make judgments that are susceptible to
unreliability. Accuracy in scoring can be improved with careful attention to
the linguistic criteria being assessed, as well as through experience and
training of the administrators to develop sound judgment.
The success of an oral interview will depend on:
clearly specifying administrative procedures of the assessment (practicality)
focusing the questions and probes on the purpose of the assessment
(validity)
appropriately eliciting an optimal amount and quality of oral production
from the test-taker (biased for best performance)
creating a consistent, workable scoring system (reliability)

Assessing interactive speech:


Interview (ctd.)
According to Brown, every effective interview contains a number of
mandatory stages. Two decades ago, Michael Canale (1984) proposed a
framework which has withstood the test of time. Canale suggested that
test-takers will perform at their best if they are led through four stages:

Example: Interviews

Assessing interactive speech:


Role play
This is a popular pedagogical activity in communicative language teaching classes.

Advantages: role plays can be controlled or guided by the
interviewer while freeing students to use discourse
that might otherwise be difficult to elicit, allowing
test-takers to go beyond the simple intensive and
responsive levels to a level of creativity and
complexity that approaches real-world pragmatics.
Scoring: presents the usual complications of any
task that elicits somewhat unpredictable responses
from test-takers.

Assessing interactive speech:


Discussions and conversations
Difficult to specify and even more difficult to score.
Advantages: as informal techniques to assess
learners, they offer a level of authenticity and
spontaneity that other assessment techniques may
not provide.
Discussion is an integrative task, so it is advisable to
give some cognizance to comprehension
performance in evaluating learners.
Scoring: checklists should be carefully designed to
suit the objectives of the observed discussion.

Assessing interactive speech:


Discussions and conversations (ctd.)
Discussions may be especially appropriate tasks
through which to elicit and observe such abilities as:

Assessing interactive speech:


Games
Among informal assessment devices are a variety of games that
involve language production. Some examples include:
Tinkertoy game
Crossword puzzles
Information gap grids
City maps
Advantages: as informal techniques to assess learners, they
offer a level of authenticity and spontaneity that other
assessment techniques may not provide.
Scoring: checklists should be carefully designed to suit the
objectives of the observed game or activity.

Example: Games

Assessing extensive speech:


These tasks involve complex, relatively lengthy stretches of
discourse. They are frequently variations on monologues, usually
with verbal interaction from listeners or an interlocutor being either
highly limited or ruled out altogether.
Some of the most commonly used techniques include:
Speeches and oral presentations
Picture-cued story-telling
Retelling a story or news event
Translation (of extended prose)

Assessing extensive speech:


Oral Presentations
These tasks consist of having the test-taker present a report, a
paper, a marketing plan, a sales idea, a design for a new product,
or a method.
Scoring: checklists and grids are common means of scoring these
tasks.
Scoring is the key assessment challenge for oral
presentations, so the rules for effective assessment must be
invoked (a rough scoring sketch follows this list):
Specify the criterion clearly
Set appropriate tasks
Carefully elicit optimal output
Establish practical, reliable scoring procedures
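
As a rough, hypothetical illustration of the checklist-and-grid approach mentioned above, the sketch below pairs yes/no checklist items with 0-4 grid ratings and reports a percentage; the item wording, category names, and point values are illustrative assumptions rather than a rubric from Brown (2004).

```python
# Hypothetical checklist-and-grid score sheet for an oral presentation.
# Checklist items are pass/fail; grid categories are rated 0-4. All names and values are illustrative.

from dataclasses import dataclass, field
from typing import Dict

CHECKLIST_ITEMS = [
    "states the purpose of the talk",
    "supports points with examples or visuals",
    "closes with a summary or recommendation",
]

GRID_CATEGORIES = ["content", "organization", "delivery", "language"]  # each rated 0-4

@dataclass
class PresentationScore:
    checklist: Dict[str, bool] = field(default_factory=dict)
    grid: Dict[str, int] = field(default_factory=dict)

    def percentage(self) -> float:
        """Combine checklist passes (1 point each) and grid ratings into a percentage."""
        check_points = sum(1 for item in CHECKLIST_ITEMS if self.checklist.get(item, False))
        grid_points = sum(self.grid.get(cat, 0) for cat in GRID_CATEGORIES)
        max_points = len(CHECKLIST_ITEMS) + 4 * len(GRID_CATEGORIES)
        return 100 * (check_points + grid_points) / max_points

score = PresentationScore(
    checklist={"states the purpose of the talk": True,
               "supports points with examples or visuals": True,
               "closes with a summary or recommendation": False},
    grid={"content": 3, "organization": 4, "delivery": 2, "language": 3},
)
print(f"{score.percentage():.1f}%")
```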

Assessing extensive speech:


Oral Presentations (ctd.)

Assessing extensive speech:


Picture-cued story-telling
These tasks are similar to those we reviewed for assessing intensive
production. The object is to elicit oral production through visual
cues. Some of the stimuli used include:
Pictures
Photographs
Diagrams
Charts
Series of pictures for longer descriptions
Scoring: criteria need to be clear about what is being assessed. For
example, it is insufficient to specify the objective as simply eliciting
narrative discourse; this must be further clarified by deciding
whether the assessment is evaluating oral vocabulary, time
relatives, sentence connectors, past tense of irregular verbs, etc.

Example: Picture-cued elicitation task for extensive production

Possible questions:
1. Who is eating?
2. Who is drinking?
3. Who is talking?
4. What is she doing?
In applying questions it is important to
know the purpose of each question.

Brown & Sahni, 1994

The purpose of the first three questions is to cue the test-taker toward
inferring what the woman next to the table could be doing.

Example: Picture-cued elicitation task for extensive production

Brown & Sahni, 1994

This task elicits more open-ended performance whereby test-takers have
to elaborate with their own opinions, describe preferences, and accomplish
a persuasive function. These tasks must have clearly defined goals and a
scoring rubric.
Rubrics could include:
Grammar
Vocabulary
Comprehension
Fluency
Pronunciation
Task accomplishment (persuasive?)

Assessing extensive speech:


Retelling a story or news event
In these tasks test-takers hear or read a story or news
event that they are asked to retell.
The difference from paraphrasing is the longer stretches
of discourse and the different genre.
Scoring: the most significant challenge, as with all
extensive production assessments; the task should therefore be
designed to meet a clear set of criteria.
Some commonly used rubrics include communicating
sequences and relationships of events, stress and
emphasis patterns, expression in the case of a dramatic
story, fluency, and interaction with the hearer.

Assessing extensive speech:


Translation (of extended prose)
Longer texts are presented for the test-taker to read in the native
language and then translate into English. Some examples of such texts
include:
Dialogues
Directions for assembly of a product
A synopsis of a story, play, or movie
Directions on how to find something on a map
Advantages: control of the content, vocabulary, and, to some
extent, the grammatical and discourse features.
Disadvantage: as we know, translation of text is a highly specialized
skill for which some individuals obtain advanced degrees.
Scoring: criteria should therefore take into account not only the
purpose in eliciting a translation but also the possibility of errors that are
unrelated to oral production ability.

Final comments

Oral Proficiency scoring categories (Brown 2001)


PhonePass (imitative and intensive) vs.
TSE (responsive and interactive) vs.
OPI (oral interview)

References
Brown, H. D. (2004). Language Assessment: Principles and Classroom Practices. Pearson Longman.
