Rubrics on Scoring English Tests for Four Language Skills
Pandiya
A teacher at the Accounting Department, Polines
Abstract: Every language program in an educational institution covers
components such as input, process, and output. Input is the material to be
processed, process is the teaching and learning program, and output is the result
of the process. One part of the teaching and learning program is the assessment or
evaluation of the program. Assessment is carried out by testing the
language learners. Testing uses objective tests, subjective tests, or both,
and refers to the four language skills, i.e. listening, speaking, reading, and
writing. Each skill can be scored with a holistic scoring system or an analytic
scoring system. The holistic scoring system assigns a single score to a language
skill, without referring to the individual elements of that skill. The
analytic scoring system, by contrast, scores every element of a
language skill and then totals the scores to get the final score. Scoring also proceeds
from the simplest step to the most complex one and from the micro-skills to
the macro-skills. Instructors or teachers may choose one of these alternatives
or combine both.
Keywords: assessment, objective test, subjective test, holistic scoring system,
and analytic scoring system.
INTRODUCTION
Linguists describe language as a means of communication. This is
fundamental to language. There are at least three theoretical views of language. The
first is the structural view, i.e. language viewed as a system of structurally related
elements for the coding of meaning. The second is the functional view, i.e. language
viewed as a vehicle for the expression of functional meaning. The third is the
interactional view, i.e. language viewed as a
vehicle for the realization of interpersonal relations and for the performance of social
transactions between individuals. Language is seen as a tool for the creation and
maintenance of social relations (Richards and Rodgers, 1992: 17). This emphasizes the
fundamental, universal function of language, i.e. language as a means of communication.
Other linguists describe the functions of language further. Jeremy Harmer (2007: 76)
describes a language function as a purpose one wishes to achieve when one says
or writes something. By performing the function, one is performing an act of
communication. If one says "I apologize", one is performing the function of apologizing;
likewise, when one says "I promise", one is performing the function of
promising. An invitation takes a rather different form, e.g. "D'you come to
the cinema?" or an expression like "D'you fancy coming around for a meal?" Many
functional exponents (patterns or phrases) are exactly this kind of lexical phrase. He
states further that students who want to express themselves in speaking and writing
need to know how to perform these functions, and how to do so appropriately in a given
situation (whether formal or informal). James R Danis (1976: 67) states that language is
the key to understanding a subject, so to understand the subject is to understand its
language. Almost all of what we customarily call knowledge is language, which
means that the key to understanding a subject is to understand its language. In fact, that is
Ragam Jurnal Pengembangan Humaniora Vol. 13 No. 1, April 2013

a rather awkward way of saying it, since it implies that there is such a thing as a subject
which contains language. It is more accurate to say that what we call a subject is its
language. What is biology (for example), other than words? If all the words that biologists
use were subtracted from the language, there would be no biology. This means, of
course, that every teacher is a language teacher. Lastly, Gladys G Doty and Janet
Ross (1973: 151) state that language is an instrument for understanding the background of
people, activities, and values.
The study of English as a foreign language has become more complex and
now covers many fields. It comprises at least three areas, i.e. the teaching and
learning of English as a Foreign Language (EFL), English for Specific Purposes (ESP), and
English for Academic Purposes (EAP). The teaching and learning of EFL covers
areas such as English for Adult Learners, English for Young Learners, English Teaching
and Learning Methods, EFL Curriculum, EFL Teacher Education, and EFL Evaluation.
ESP relates to areas such as Vocational English, English for Business and Trade, English
for Society Development, English for Mass Media, Literary English, and Propaganda.
Finally, EAP relates to areas such as English in
Articles/Papers and English for Presentations, Seminars, and Thesis and Dissertation Writing.
The study of English also helps students in their quest for more
knowledge from many international sources. There are two modes of language
communication: (1) receiving the message, i.e. reading and listening, and (2) sending the
message, i.e. speaking and writing. The first are called receptive skills, and the latter
productive skills. To find out how far language learners have mastered the materials given
in an instruction program, whether formal or informal, teachers,
educators, or instructors carry out some kind of evaluation activity, i.e. giving tests to
the language learners. The tests usually include both oral and written tests
and cover the four language skills, i.e. listening, speaking, reading, and writing. This paper
aims to describe how tests are given to language learners,
with particular attention to test scoring systems.
GOOD ENGLISH LANGUAGE TESTS
A good test is judged against two
standards, i.e. reliability and validity. These two qualities determine
whether a test is good and adequate. A test is reliable if, as a
measurement instrument, it gives consistent results. In other words, reliability is the
degree of consistency between two measures of the same thing (Mehrens and Lehmann
in Saleh, 2008: 33). Validity concerns the degree to
which a test is capable of achieving certain aims (Mehrens and Lehmann in Saleh, 2008:
33). There are usually two aims, i.e. to describe and to predict test takers' ability to
perform certain tasks. In other words, the validity of a test is its accuracy in measuring
what it is meant to measure. If the teacher wants to know the test takers' ability to deliver
a speech, the appropriate test is a speaking or oral test; a written test is not
appropriate. For the English instruction program of an institute, whether formal or informal,
the test can be a standardized test, a standards-based test, or both. A good
standardized test is the product of a thorough process of empirical research and
development. It dictates standard procedures for administration and scoring. Finally, it
is typically a norm-referenced test, the goal of which is to place test takers on a continuum
across a range of scores and to differentiate test takers by their relative ranking (Brown,
2003: 67). Examples of standardized tests are TOEFL, TOEIC, and IELTS, which fulfill the
criteria of specifying a set of competencies (standards) for a given domain and, through a
process of construct validation, programming a set of tasks that have been designed to
measure those competencies (Brown, 2003: 67). For these tests, the instruction institute
simply cooperates with other institutes that have the authority to conduct those
kinds of tests, such as Gadjah Mada University and the Indonesia Australia Language Foundation
(IALF) in Jakarta or Denpasar. The second alternative is a standards-based test, i.e. a
test composed and prepared on the basis of the curriculum and/or syllabus of
the institute itself (Brown, 2003: 105). The last alternative is to use both standardized
and standards-based tests.
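Reliability, described above as the degree of consistency between two measures of the same thing, can be illustrated with a small calculation. The sketch below assumes that the two measures are two sittings of the same test by the same learners and that consistency is estimated with the Pearson correlation coefficient, which is one common choice and is not prescribed by the sources cited here. The function name `pearson_r` and the scores are hypothetical.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between two equally long lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    n = len(first)
    mean_a = sum(first) / n
    mean_b = sum(second) / n
    # Sum of cross-products and sums of squares of the deviations.
    cross = sum((a - mean_a) * (b - mean_b) for a, b in zip(first, second))
    squares_a = sum((a - mean_a) ** 2 for a in first)
    squares_b = sum((b - mean_b) ** 2 for b in second)
    if squares_a == 0 or squares_b == 0:
        raise ValueError("scores in each list must vary")
    return cross / sqrt(squares_a * squares_b)


# Hypothetical scores of five test takers on two sittings of the same test.
first_sitting = [60, 70, 75, 85, 90]
second_sitting = [62, 68, 78, 84, 92]

reliability = pearson_r(first_sitting, second_sitting)
print(round(reliability, 2))  # a value close to 1 suggests consistent results
```

A coefficient near 1 would indicate that the test ranks the learners almost identically on both occasions; a low coefficient would suggest inconsistent measurement.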
SCORING SYSTEM OF SPEAKING TEST
In this discussion the scoring system is intended specifically for standards-based
tests. Speaking tests vary according to the language elements being assessed and the
objectives of the test. Based on language elements, a language test can in general be
classified as a discrete-point test, an integrative test, or a pragmatic test (Oller in
Mukminatien, 2000: 38). A discrete-point test measures one language component,
such as pronunciation, intonation, grammar, or vocabulary. An integrative test measures all
the language components at once. A pragmatic test measures the learner's ability to
use the target language for communicative purposes in a given context, i.e. it is a functional
speaking test. A functional speaking test can involve interactive communication
or transactive communication (Brown and Yule in Mukminatien, 2000: 39). The first refers to
the function of language in maintaining social interaction, as in interviews and role plays.
The latter refers to communication that focuses on conveying the message rather than
on the interaction, as in storytelling, giving a speech, reading an announcement, presenting
a report, and many others. According to Underhill in Nur Mukminatien (2000: 39) there are
two scoring approaches, i.e. the analytic system and the impressionistic system. In the
first, the rater separates speaking
skill into sub-skills, scores each component, and then sums the sub-scores
into a final score. In the latter, the rater judges the learner's speaking ability on the basis
of a general impression of the learner's performance, without necessarily separating
the speaking components. Thus, the rater arrives directly at a single score without totaling
sub-scores as in the analytic system. Other experts such as Lloyd Jones, White,
Spandel, and Stiggins in Nur Mukminatien (2000: 40) call the impressionistic system the
holistic system/approach in writing assessment. The impressionistic system uses
a three-category scoring scale: 0 = inappropriate or seriously incorrect, 1 =
relevant but not entirely acceptable, 2 = appropriate and correct.
The analytic system assesses six elements/components, i.e. fluency,
grammatical accuracy, pronunciation of sentences, pronunciation of words and sounds,
interactive communication, and vocabulary resources. The complete description is given
as follows:
(1) Description of Language Components

No.  Language Component         Description
1    Pronunciation              1. Pronunciation of individual sounds and words
                                2. Pronunciation of sentences, with the right
                                   intonation and stress
2    Grammatical Accuracy       Accurate use of structure, or how the learner gets
                                his/her utterance correct
3    Vocabulary                 The learner's ability to choose appropriate words and
                                to solve the problem when he/she cannot find a
                                suitable word by explaining around the word
4    Fluency                    1. The ability to keep the conversation going
                                2. The ability to read a text smoothly without
                                   hesitation, inappropriate pauses, or repeated
                                   words/lines
5    Interactive Communication  The ability to get the meaning across to the listener

(2) Scale Criteria

Scale  Proficiency  Category   Component  Description of Criteria
0      10-39%       Very Poor  Pron       Many wrong pronunciations
                               GA         No mastery of sentence construction
                               Voc        Little knowledge of English words
                               Flue       Dominated by hesitation
                               IC         Message unclear
       40-50%       Poor       Pron       Frequent incorrect pronunciation
                               GA         Major problems in structure
                               Voc        Frequent errors of word choice
                               Flue       Frequent hesitation
                               IC         Disconnected ideas
       60-70%       Average    Pron       Occasional errors in pronunciation
                               GA         Several errors in structure
                               Voc        Occasional errors in word choice
                               Flue       Occasional hesitation
                               IC         Ideas stand but loosely organized
       75-80%       Good       Pron       Some errors in pronunciation
                               GA         Minor problems in structure
                               Voc        Minor errors in word choice
                               Flue       Minor hesitation
                               IC         Clear and organized ideas
       85-100%      Very Good  Pron       No errors/Minor errors
                               GA         Demonstrates mastery of structure (few errors)
                               Voc        Effective/appropriate word choice
                               Flue       No hesitation
                               IC         Well organized and clear ideas

Pron = Pronunciation, GA = Grammatical Accuracy, Voc = Vocabulary, Flue = Fluency,
IC = Interactive Communication.
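The analytic procedure described above (score each component, then total the sub-scores) can be sketched in a few lines. This is only an illustration under stated assumptions: each of the five components is given a proficiency percentage, each band is taken to start at its printed lower bound and to run up to the next band (the printed ranges leave gaps, e.g. between 50% and 60%), and the lowest band is extended down to 0%. Reporting the mean percentage alongside the total is also an assumption made here, as are the names `category` and `analytic_speaking_score`.

```python
COMPONENTS = ("Pron", "GA", "Voc", "Flue", "IC")

# Assumption: a band covers everything from its lower bound up to the next band.
BANDS = (
    (85, "Very Good"),
    (75, "Good"),
    (60, "Average"),
    (40, "Poor"),
    (0, "Very Poor"),
)


def category(percent):
    """Return the category label for a proficiency percentage."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    for lower_bound, label in BANDS:
        if percent >= lower_bound:
            return label


def analytic_speaking_score(sub_scores):
    """Total the five sub-scores and classify their mean percentage."""
    missing = [name for name in COMPONENTS if name not in sub_scores]
    if missing:
        raise ValueError(f"missing sub-scores: {missing}")
    total = sum(sub_scores[name] for name in COMPONENTS)
    mean_percent = total / len(COMPONENTS)
    return total, mean_percent, category(mean_percent)


# Hypothetical ratings for one learner.
learner = {"Pron": 80, "GA": 70, "Voc": 75, "Flue": 65, "IC": 85}
total, mean_percent, label = analytic_speaking_score(learner)
print(total, mean_percent, label)  # 375 75.0 Good
```

Keeping the per-component scores next to the total preserves the diagnostic value of analytic scoring: this hypothetical learner is weakest in fluency, even though the overall category is Good.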

Other experts suggest slightly different scoring systems. HD Brown (2003: 166)
suggests the following scoring guide for spoken English:

Rating Scale/
Category       Description of Criteria
60             Communication almost always effective, task performed very
               competently, speech almost never marked by non-native
               characteristics
50             Communication generally effective, task performed competently,
               successful use of compensatory strategies, speech sometimes
               marked by non-native characteristics
40             Communication somewhat effective, task performed somewhat
               competently, some successful use of compensatory strategies,
               speech regularly marked by non-native characteristics
30             Communication generally not effective, task generally performed
               poorly, ineffective use of compensatory strategies, speech very
               frequently marked by non-native characteristics
20             No effective communication, no evidence of ability to perform task,
               no effective use of compensatory strategies, speech almost always
               marked by non-native characteristics

For language components, HD Brown (2003: 172-173) suggests six items, i.e.
grammar, vocabulary, comprehension, fluency, pronunciation, and task. Another
alternative is to score on the scale 0, 0+, 1, 1+, 2, 2+, 3, 3+, 4, 4+, and 5. The levels
can be superior, advanced, intermediate, and novice (Brown, 2003: 174-177). These, of
course, apply to more complex speaking tests of oral proficiency/oral production.
SCORING SYSTEM OF WRITING TESTS
As with speaking tests, there are two scoring systems for
writing tests, i.e. the analytic system and the holistic system. In the first, the rater
separates writing skill into sub-skills,
scores each component, and then sums the sub-scores into a final score. In the latter,
the rater judges the learner's writing ability on the basis of a general impression
of the learner's performance, without necessarily separating the writing components.
Thus, the rater arrives directly at a single score without totaling sub-scores as in
the analytic system.
HD Brown (2003: 219) describes the genres of writing as follows: (1)
Academic writing, such as papers and general subject reports, essays, compositions,
academically focused journals, short-answer test responses, technical reports (e.g. lab
reports), theses, and dissertations; (2) Job-related writing, such as messages (e.g. phone
messages), letters/emails, memos (e.g. interoffice), reports (e.g. job evaluations, project
reports), schedules, labels, signs, advertisements, and announcements; (3) Personal
writing, such as letters, emails, greeting cards, invitations, messages, notes, calendar
entries, shopping lists, reminders, financial documents (e.g. checks, tax forms, loan
applications), forms, questionnaires, medical reports, immigration documents, diaries,
personal journals, and fiction (e.g. short stories, poetry).
The three genres above also determine the scoring system for writing ability.
HD Brown (2003: 239) describes the rating scale for writing ability under the holistic
system as follows:
Rating Scale/
Category       Description of Criteria
6              Demonstrates clear competence in writing on both the rhetorical and
               syntactic levels, though it may have occasional errors
5              Demonstrates competence in writing on both the rhetorical and
               syntactic levels, though it will probably have occasional errors
4              Demonstrates minimal competence in writing on both the rhetorical
               and syntactic levels
3              Demonstrates some developing competence in writing, but it remains
               flawed on either the rhetorical or syntactic level, or both
2              Suggests incompetence in writing
1              Demonstrates incompetence in writing
0              Contains no response, merely copies the topic, is off the topic, is
               written in a foreign language, or consists only of keystroke characters

For classroom evaluation, learning is best served through analytic scoring, in which as
many as six major elements of writing (or five) are scored, thus enabling learners to home
in on weaknesses and to capitalize on strengths (Brown, 2003: 243). The six major
elements of writing cover organization, logical development of ideas, grammar,
punctuation/spelling/mechanics, style, and quality of expression, whereas the five
major elements cover content, organization, vocabulary, syntax, and mechanics.
The analytic scale for rating composition tasks suggested by Brown and Bailey in HD
Brown (2003: 244-245) is as follows:

No.  Elements of Writing              Category/Rating Scale  Description
1    Organization: Introduction,      20-18                  Excellent to Good
     Body, and Conclusion             17-15                  Good to Adequate
                                      14-12                  Adequate to Fair
                                      11-6                   Unacceptable
                                      5-1                    Not college level work
2    Logical Development of           20-18                  Excellent to Good
     Ideas: Content                   17-15                  Good to Adequate
                                      14-12                  Adequate to Fair
                                      11-6                   Unacceptable
                                      5-1                    Not college level work
3    Grammar                          20-18                  Excellent to Good
                                      17-15                  Good to Adequate
                                      14-12                  Adequate to Fair
                                      11-6                   Unacceptable
                                      5-1                    Not college level work
4    Punctuation, Spelling,           20-18                  Excellent to Good
     and Mechanics                    17-15                  Good to Adequate
                                      14-12                  Adequate to Fair
                                      11-6                   Unacceptable
                                      5-1                    Not college level work
5    Style and Quality of             20-18                  Excellent to Good
     Expression                       17-15                  Good to Adequate
                                      14-12                  Adequate to Fair
                                      11-6                   Unacceptable
                                      5-1                    Not college level work
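As a rough illustration of how this analytic scale might be applied, the sketch below rates each of the five elements from 1 to 20, looks up the description for each rating, and totals the ratings. Totaling to a score out of 100 follows the general definition of analytic scoring given earlier in this section; the function names and the sample ratings are hypothetical.

```python
ELEMENTS = (
    "organization",
    "logical development of ideas",
    "grammar",
    "punctuation, spelling, and mechanics",
    "style and quality of expression",
)

# Lower bound of each rating band and its description.
WRITING_BANDS = (
    (18, "Excellent to Good"),
    (15, "Good to Adequate"),
    (12, "Adequate to Fair"),
    (6, "Unacceptable"),
    (1, "Not college level work"),
)


def band_label(points):
    """Return the description for a rating between 1 and 20."""
    if not 1 <= points <= 20:
        raise ValueError("a rating must be between 1 and 20")
    for lower_bound, label in WRITING_BANDS:
        if points >= lower_bound:
            return label


def score_composition(ratings):
    """Return the total (out of 100) and the description for each element."""
    if set(ratings) != set(ELEMENTS):
        raise ValueError("exactly one rating per element is required")
    labels = {element: band_label(ratings[element]) for element in ELEMENTS}
    return sum(ratings[element] for element in ELEMENTS), labels


# Hypothetical ratings for one composition.
ratings = {
    "organization": 18,
    "logical development of ideas": 16,
    "grammar": 13,
    "punctuation, spelling, and mechanics": 10,
    "style and quality of expression": 15,
}
total, labels = score_composition(ratings)
print(total)  # 72
for element in ELEMENTS:
    print(f"{element}: {ratings[element]} ({labels[element]})")
```

The per-element descriptions are what let a learner home in on a weakness (here, mechanics) rather than seeing only the total.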

SCORING SYSTEM OF READING TESTS AND LISTENING TESTS


Reading and listening skills differ from speaking and writing skills. The
first are receptive skills and the latter productive skills; put another way,
the first involve receiving the message, while the latter involve sending/giving the
message. The first may seem passive and the latter active, though both
in fact demand serious effort and full energy.
The genres of reading cover the
following: (1) Academic reading, such as general interest articles (in magazines,
newspapers, etc.), technical reports (e.g. lab reports), professional journal articles,
reference material (dictionaries, etc.), textbooks, theses, essays, papers, test directions,
editorials, and opinion writing; (2) Job-related reading, such as messages (e.g. phone
messages), letters/emails, memos (e.g. interoffice), reports (e.g. job evaluations, project
reports), schedules, labels, signs, announcements, forms, applications, questionnaires,
financial documents (bills, invoices, etc.), directories (telephone, office, etc.), manuals,
and directions; (3) Personal reading, such as newspapers and magazines, letters, emails,
greeting cards, invitations, messages, notes, lists, schedules (train, bus, plane, etc.),
recipes, menus, maps, calendars, advertisements (commercials, want ads), novels, short
stories, jokes, drama, poetry, financial documents (e.g. checks, tax forms, loan
applications), forms, questionnaires, medical reports, immigration documents, comic
strips, and cartoons (Brown, 2003: 186-187).
For scoring reading tests, there are two alternatives, i.e. subjective tests and
objective tests. The scoring system of an objective test is clear: 1 for a correct answer
and 0 for a wrong answer. The scoring system of a subjective test varies with the
elements of reading being assessed, such as grammar, vocabulary, graphology (writing
rules/styles), and content. The scoring scale can also vary with the level of the cognitive
domain, whether knowledge, comprehension, application, analysis, synthesis, or evaluation.
It also depends on whether the micro-skills or the macro-skills of
reading comprehension are being tested.
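The objective scoring rule stated above (1 for a correct answer, 0 for a wrong answer) can be sketched as follows. The function name `score_objective_test`, the answer key, and the responses are hypothetical and serve only as an illustration.

```python
def score_objective_test(answer_key, responses):
    """Score each item 1 if it matches the key and 0 otherwise."""
    if len(answer_key) != len(responses):
        raise ValueError("the key and the responses must have the same length")
    item_scores = [
        1 if given == correct else 0
        for correct, given in zip(answer_key, responses)
    ]
    return item_scores, sum(item_scores)


# Hypothetical five-item multiple-choice reading test.
answer_key = ["B", "A", "D", "C", "A"]
responses = ["B", "C", "D", "C", "B"]

item_scores, raw_score = score_objective_test(answer_key, responses)
print(item_scores, raw_score)  # [1, 0, 1, 1, 0] 3
```

Because every item has a single keyed answer, two raters using the same key will always arrive at the same raw score, which is what makes this kind of test objective.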
For listening tests, the scoring system is similar to that of reading tests, i.e. it runs from
the simplest level to the most complex one, or from the micro-skills of listening to the
macro-skills of
listening. This starts from discriminating among the distinctive sounds of English and
proceeds to developing and using a battery of listening strategies, such as detecting
key words, guessing the meaning of words from context, appealing for help, and signaling
comprehension or lack thereof. Listening tests can likewise be scored as objective tests
or subjective tests. The elements of listening are similar to those of reading, i.e. grammar,
vocabulary, content, and phonological aspects (intonation, stress, and/or tone). The
difference is graphology (writing rules/styles) for reading versus phonological aspects for
listening. The scoring scale of an objective test is simply 1 for a correct answer and 0 for
a wrong answer. For subjective tests, the scale varies with the elements of listening
assessed and with the level of the cognitive domain, whether knowledge, comprehension,
application, analysis, synthesis, or evaluation.
CONCLUSIONS
The best and fairest scoring system for tests of the four language skills depends on the
policy of teachers, institutions, and all authorized persons involved in the
education/instruction program. They must therefore collaborate to create a quality
education/instruction program, especially in evaluation/assessment activities.
REFERENCES
Bloom. 2004. Bloom's Taxonomy's Model Questions and Key Words. The UT Learning
Center, The University of Texas at Austin.
Brown, HD. 2003. Language Assessment: Principles and Classroom Practices. San
Francisco State University, California.
Danis, James R. 1976. Teaching Strategies for College Classroom. Westview Press.
Harmer, Jeremy. 2007. How to Teach English, An Introduction to the Practice of English
Language Teaching. England: Pearson Education Limited.
Mukminatien, Nur. 2000. The Advantages of Using an Analytic Scoring Procedure in
Speaking Assessment. TEFLIN Journal, Vol. XI, No. 1, August 2000. Universitas
Negeri Malang.
Richards, Jack C and Theodore S Rodgers. 1992. Approaches and Methods in Language
Teaching: A description and analysis. Cambridge: Cambridge University Press.
Saleh, Mursid. 2008. Enam Tradisi Besar Penelitian Pendidikan Bahasa. Program Pascasarjana Universitas Negeri Semarang.
