
2014 IEEE 14th International Conference on Advanced Learning Technologies

Usage Data and Group Rankings In Peer Review Settings


A Case Study on Students’ Behavior and Performance

Pantelis M. Papadopoulos
Aarhus University
Aarhus, Denmark
pmpapad@tdm.au.dk

Thomas D. Lagkas
The University of Sheffield, International Faculty, CITY College, Greece
T.Lagkas@sheffield.ac.uk

Abstract—This paper focuses on the effect that usage data and group ranking information may have when presented to students in a peer review setting. The study analyzes the performance and attitudes of 56 sophomore students enrolled in a Network Planning and Design course. The students, grouped randomly in three different conditions, followed the prescriptions of a free-selection peer review protocol that guided students in a double-blind review process, allowing them to select on their own the peer work to review. Students in the first condition acted as the control group, without any information on their usage data or rankings. Students in the second condition had access to usage data mirroring their activity in the study, while students in the third condition received additional information on their rankings inside their group. Result analysis suggests that, while there is no difference in domain knowledge acquisition, engagement was higher in the third group, with students spending more time in the activity and expressing a more positive attitude towards the study.

Keywords—peer review, free-selection, Computer Science education, engagement and motivation

I. INTRODUCTION

Peer review is a widely used instructional method that aims to assist students in acquiring domain knowledge, while also developing domain-independent reviewing skills. Typically, a peer review process includes four phases: (a) production of initial student work, (b) assigning authors' work to reviewers, (c) providing feedback to authors on their work, and (d) revising the initial work according to review comments and suggestions. McConnell [1] argues that peer reviewing offers students the opportunity for a constructive and collaborative learning experience, by engaging them in an active learning exercise. Peer review is primarily expected to support higher-level learning skills such as synthesis, analysis, and evaluation [2], as the students have the opportunity to analyze and evaluate peer work. Scardamalia and Bereiter [3], by implementing the method in school classes, have provided evidence that higher cognitive processes of learning are stimulated and guided by the peer review procedure. The method has been used extensively in various fields (e.g., [4][5]), including Computer Science (e.g., [6][7]), the primary domain of our work.

Our research scope is to enhance the learning experience for students in peer review settings, focusing mainly on the assigning phase by allowing students to freely select the peer work they want to analyze and review. We call this the Free-Selection review protocol, and it has been the center of attention in a series of studies. Results showed that students that acted both as authors and reviewers in a free-selection review setting outperformed students that assumed the same roles in teacher-defined pairs [8]. In addition, result analysis of a recent study on the way peer comments and self-reflection affect students revealed that intrinsic feedback may have a stronger effect than extrinsic feedback (reviewers' comments) [9]. We refer to intrinsic feedback as the deeper understanding constructed by a reviewer on the quality of her own work, through critical reflection on peers' contributions.

In this paper, we focus on ways to enhance students' engagement in peer review by providing usage and ranking information. Usage information refers to the effort put in by the students during the activity, as evidenced by the number of logins in the learning environment, the number of peer submissions read, and the number of reviews submitted. Ranking information adds value to these data by also informing students of their relative position in the group for each of the usage metrics. The study examined whether this additional information would trigger additional motives for the students, or whether ranking information could have a detrimental effect on students' behavior.

II. METHOD

A. Participants

The study employed 56 volunteering sophomore students (32m, 24f) majoring in Informatics and Telecommunications Engineering in a 5-year study program. We awarded a bonus grade for the lab section of the course to each student that successfully completed all the phases of the activity. We randomly assigned the student population into three study conditions: Control: 18 students (11m, 7f); Usage: 20 students (11m, 9f); Ranking: 18 students (10m, 8f). All students were enrolled in the "Network Planning and Design" (NP&D) course, which teaches students how to analyze clients' needs, identify system specifications, and design an appropriate computer network.
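The random assignment into the three unequal-sized conditions can be sketched as follows. This is an illustrative reconstruction, not the authors' actual procedure; the function and student names are hypothetical:

```python
import random

def assign_conditions(students, sizes, seed=None):
    """Randomly partition `students` into consecutive groups of the given sizes."""
    assert sum(sizes) == len(students)
    rng = random.Random(seed)
    shuffled = students[:]          # copy so the input list is untouched
    rng.shuffle(shuffled)
    groups, start = [], 0
    for size in sizes:
        groups.append(shuffled[start:start + size])
        start += size
    return groups

# 56 participants split into Control (18), Usage (20), and Ranking (18)
students = [f"s{i:02d}" for i in range(56)]
control, usage, ranking = assign_conditions(students, [18, 20, 18], seed=1)
print(len(control), len(usage), len(ranking))  # 18 20 18
```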

978-1-4799-4038-7/14 $31.00 © 2014 IEEE
DOI 10.1109/ICALT.2014.92
This paper is available at IEEE Xplore: http://dx.doi.org/10.1109/ICALT.2014.92
B. Material

1) The learning environment: We designed and developed the learning environment used in this study ourselves, so that we could more easily tailor the study conditions to our research needs. The learning material was presented as realistic open-ended scenarios (acting as the problems students had to solve), accompanied by a list of supporting material presented as past cases (acting as the resources for solving the problems). The scenarios referred to various installations of computer network systems in new/restructured facilities, while the supporting material referred to similar past projects highlighting important domain factors, such as the cost of the project, efficiency requirements, expansion requirements, the traffic type, and the end-users' profile. Students had to take into account the specific conditions and context presented and propose their own computer networks as answers to 3 scenarios.

2) Pre-test and post-test: The pre-test instrument comprised 6 open-ended questions (e.g., "How can the security requirements of a network affect its architecture?") measuring students' prior knowledge of the domain. Similarly, the post-test instrument comprised 3 open-ended questions (e.g., "Which network characteristics are affected by the end-users' profile?") used to record the level of domain knowledge students acquired in the study.

3) Attitude questionnaire: An online questionnaire was used after the study to record students' opinions and attitudes towards different aspects of the activity.

C. Procedure

The whole activity lasted 2 weeks, starting with the Pre-test phase and the written test in the classroom. During the Study phase, which started right after and lasted 1 week, students logged in the environment (from wherever and whenever they wanted) and worked on the 3 available scenarios, proposing their own computer networks.

Next, the students entered the Review phase, which lasted 4 days. During this time, students gained access to all the peer work in their groups and had to review, in a double-blind process, at least 1 peer submission per scenario. Students were free to select the submissions they wanted. Along with the comments, the reviewer also had to suggest a grade according to the following scale: (1: Rejected/Wrong answer; 2: Major revisions needed; 3: Minor revisions needed; 4: Acceptable answer; 5: Very good answer).

During the Review phase, each student in the Usage group was able to see (a) the number of logins, (b) the number of answers read, and (c) the number of answers reviewed. These 3 metrics did not provide any additional information to the students, since a student could also find out the respective values by taking notes of her activity. On the contrary, a student in the Ranking group was able to see, in addition to the previous information, her rankings and the average values in her group. This was something that the students could not have measured themselves. A chromatic code was used to denote the quartile according to the ranking (1st: green; 2nd: yellow; 3rd: orange; 4th: red).

In the Revise phase that followed and lasted 3 days, the students were able to revise their own answers according to the comments received from their peers. In case a student did not receive peer comments for her answers, the system asked her to perform a self-review before revising the initial answer. The self-review form asked students to analyze their own answers, having in mind also the networks other students proposed. During the Revise phase, one more metric was available to the Usage and Ranking groups: the average grade received from peers, along with the number of received reviews. After the Review and Revise phases, the students took a written post-test in class, followed by the attitudes questionnaire that concluded the activity.

III. RESULTS

Students' answers in the written tests (pre- and post-test) and the learning environment were independently assessed by two raters that followed predefined grading instructions. Table I presents students' scores in the pre- and post-test and in the learning environment (initial/revised answers). Students scored very low in the pre-test instrument, providing evidence that they were, indeed, novices in the domain. One-way analysis of variance results showed that the 3 groups were comparable both in the pre- and post-test and in the initial and revised answers (p>.05).

TABLE I. STUDENTS' PERFORMANCE

                 Control (n=18)   Usage (n=20)    Ranking (n=18)
                 M (SD)           M (SD)          M (SD)
  (scale 0-10)
  Pre-test       2.11 (0.98)      1.98 (1.01)     2.02 (1.09)
  Post-test      8.17 (1.01)      8.10 (0.96)     8.11 (0.93)
  (scale 1-5)
  Initial        2.94 (0.58)      3.12 (0.75)     2.80 (0.77)
  Revised        3.44 (0.67)      3.64 (0.58)     3.49 (0.77)

The lack of significant difference for the initial answers was expected, since study conditions were the same up to that point for everyone. However, the lack of statistical difference in the scores of the revised answers suggests that the instructional intervention applied did not have an effect on students' domain knowledge acquisition. Despite the latter, paired-samples t-test results showed that scores in the revised answers increased significantly (p<.05) for all the groups. In addition, no statistically significant difference between the groups was found on the usage metrics performance (logins, reads, reviews, and grades).

Table II presents students' responses in the most important items of the attitudes questionnaire.

When it comes to hours spent in the learning environment, we asked the students to keep track of time throughout the activity and report this time in the questionnaire. According to students' responses, the 3 groups were comparable (p>.05) on the time spent during the first week (Q4). On the contrary, pairwise analysis showed that students in the Ranking group spent significantly more time (p<.05) in the learning environment during the second week (Q5). The big drop between the first and the second week was expected. During the first week, students had to read a lot of material and answer the scenario questions. On the contrary, during the second week, students' workload was significantly lower, since peers' answers are shorter, while revising a scenario answer also takes less time than writing an initial one.

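The group comparisons reported in this section (one-way ANOVA across the three conditions, paired-samples t-tests within each group) can be sketched with SciPy. The score lists below are made-up stand-ins for illustration, not the study's raw data:

```python
# Sketch of the statistical tests described above, using SciPy.
# All score lists are hypothetical stand-ins, NOT the study's raw data.
from scipy import stats

# Hypothetical initial-answer scores (1-5 scale) for three small groups
control = [2.9, 3.1, 2.7, 3.0, 2.8]
usage   = [3.2, 3.0, 3.3, 2.9, 3.1]
ranking = [2.8, 2.7, 3.0, 2.9, 2.6]

# One-way ANOVA: are the three group means comparable?
f_stat, p_anova = stats.f_oneway(control, usage, ranking)

# Paired-samples t-test: did revised scores improve over initial scores
# for the same (hypothetical) students?
initial = [2.9, 3.1, 2.7, 3.0, 2.8]
revised = [3.4, 3.7, 3.2, 3.4, 3.3]
t_stat, p_paired = stats.ttest_rel(revised, initial)

print(f"ANOVA p={p_anova:.3f}, paired t-test p={p_paired:.3f}")
```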
TABLE II. QUESTIONNAIRE RESPONSES

                                Control (n=18)   Usage (n=20)   Ranking (n=18)
                                M (SD)           M (SD)         M (SD)
  Q1. Would you like to know the identity of the peers that reviewed your answers? (1: No; 5: Yes)
                                1.70 (0.80)      1.95 (1.16)    1.84 (1.30)
  Q2. Would you characterize the usage information presented during the second week as useful or useless? (1: Useless; 5: Useful)
                                n.a.             4.33 (1.06)    4.26 (0.81)
  Q3. Would you characterize the ranking information presented during the second week as useful or useless? (1: Useless; 5: Useful)
                                n.a.             n.a.           3.53 (1.07)
  Q4. How many hours did you spend approximately in the activity during the first week (Study phase)? (in hours)
                                10.08 (3.09)     10.12 (3.80)   9.59 (3.94)
  Q5. How many hours did you spend approximately in the activity during the second week (Review and Revision phases)? (in hours)
                                2.48 (1.40)      2.33 (1.01)    3.31 (1.13)*
  *. p<.05

IV. DISCUSSION

The usage metrics we presented covered several aspects of the activity. By presenting them to the students, we are also making a statement on what is important in a setting like this. Result analysis suggests that presenting the usage data alone may create a positive attitude, but this is not adequate to affect students' behavior and performance in the activity. Students of both related groups appreciated having an organized view of the metrics of their activity, but this did not result in higher levels of knowledge acquisition, either in the learning environment or in the post-test.

Usage data was something that students could estimate themselves. For example, a student could count the total number of reviews she submitted by going through each scenario. As such, providing just a value without a reference point was "nice" but rather unimportant. On the contrary, students' behavior was affected when this reference point was presented. Students in the Ranking group were able to compare themselves with the group average. This provided the motive for some of the students in the group to actively try to improve their positions. We argue that this was the main reason behind the statistical difference found in the time spent during the second week. Nevertheless, this attitude towards the ranking information was not universal in the group, as many students chose to ignore it. We had to perform a case-by-case analysis to get a better picture of how and whether rankings affected students. This deeper analysis revealed that there was a wide range of different approaches followed in the Ranking group.

As we mentioned, there were students that reacted to a low ranking and tried to better their positions by reading and reviewing more peer answers. On the other extreme, there were students that performed the bare minimum required by the activity. Rankings for those students remained low throughout the activity. Between those two attitudes, there were students that tried only partially (just in the beginning, or in a specific metric) to improve their rankings. For example, students stated that they usually got alarmed by a low ranking in the Grade metric that appeared in the Revision phase. This was the only metric that students could not change, as it depicted peers' opinions on the student's answers in the previous Review phase. However, if the Grade ranking was high, little attention was given to the rest of the metrics. This may also be problematic. Students clearly stated that knowing that their Grade metric was high in the ranking was reassuring. However, a high ranking may not always mean a correct answer. For example, since the score of the revised answers was significantly higher, a student that decided not to revise her answers, based solely on the ranking her initial answer received, may be misled.

In conclusion, providing students with usage and ranking information is one more type of feedback. Showing this information could provide an additional motive for deeper engagement to some students. However, students appropriate this information according to their intrinsic motives. Highly engaged students may identify the areas they need to improve, while less engaged students would either ignore the additional data or, worse, find an excuse to disengage even further.

REFERENCES

[1] McConnell, J. (2001). Active and cooperative learning. In Analysis of Algorithms: An Active Learning Approach. Jones & Bartlett Publishers.
[2] Anderson, L. W., & Krathwohl, D. R. (Eds.) (2001). A Taxonomy for Learning, Teaching, and Assessing: A Revision of Bloom's Taxonomy of Educational Objectives. New York: Longman.
[3] Scardamalia, M., & Bereiter, C. (1994). Computer support for knowledge-building communities. The Journal of the Learning Sciences, 3(3), 265-283.
[4] Dossin, M. M. (2003). Among friends: Effective peer critiquing. The Clearing House, 76(4), 206-207.
[5] Goldin, I. M., & Ashley, K. D. (2011). Peering inside peer review with Bayesian models. In G. Biswas et al. (Eds.), AIED 2011, LNAI 6738, pp. 90-97. Berlin: Springer-Verlag.
[6] Turner, S., Pérez-Quiñones, M. A., Edwards, S., & Chase, J. (2010). Peer review in CS2: Conceptual learning. In Proceedings of SIGCSE '10, March 10-13, 2010, Milwaukee, Wisconsin, USA.
[7] Liou, H. C., & Peng, Z. Y. (2009). Training effects on computer-mediated peer review. System, 37, 514-525.
[8] Papadopoulos, P. M., Lagkas, T. D., & Demetriadis, S. N. (2012). How to improve the peer review method: Free-selection vs. assigned-pair protocol evaluated in a computer networking course. Computers & Education, 59, 182-195.
[9] Papadopoulos, P. M., Lagkas, T. D., & Demetriadis, S. N. The impact of intrinsic and extrinsic feedback in free-selection peer review settings (submitted).
