
Learning SQL with a Computerized Tutor

Antonija Mitrovic
Computer Science Department, University of Canterbury
Christchurch, New Zealand
tanja@cosc.canterbury.ac.nz

Permission to make digital/hard copies of all or part of this material for personal or classroom use is granted without fee provided that the copies are not made or distributed for profit or commercial advantage, the copyright notice, the title of the publication and its date appear, and notice is given that copyright is by permission of the ACM, Inc. To copy otherwise, to republish, to post on servers or to redistribute to lists requires specific permission and/or fee.
SIGCSE 98 Atlanta GA USA
Copyright 1998 0-89791-994-7/98/2...$5.00

Abstract

SQL, the dominant database language, is a simple and highly structured language; yet, students have many difficulties learning it. This paper presents SQL-Tutor, an Intelligent Teaching System designed as a guided discovery learning environment, which helps students in overcoming these difficulties. We present design issues and the current state in the implementation of the system, with special focus on individualization of instruction towards a particular student.

Introduction

The author has been teaching SQL since 1991, as a part of a Database Management course. The course is usually taken by about 50 higher-level undergraduate students, and covers various topics, such as data models, database design, relational query languages (SQL, relational algebra and calculus), query processing, normalization theory, transaction processing and distributed databases.

SQL is a simple and highly structured language; yet, students have many difficulties learning it. In this paper we present SQL-Tutor, an Intelligent Teaching System (ITS) developed for guided discovery learning of SQL. Like other ITSs, SQL-Tutor focuses on the individualization of instructional sessions towards a particular student, by developing a model of the student's knowledge, learning abilities and general characteristics, and tailoring instructional actions to the student's needs.

The rest of the paper is organized as follows. In the next section we look at the typical problems students have when learning SQL and then discuss some related approaches to supporting SQL learning in section 3. Section 4 presents the architecture of SQL-Tutor, and briefly surveys its components. Learning support provided by SQL-Tutor is the theme of section 5. Finally, section 6 gives directions for future research and our plans for SQL-Tutor.

Why SQL?

SQL is a complete database language; it contains data and view definition statements, as well as data manipulation statements. The entry level of the SQL2 standard from 1992 is supported by most DBMS vendors today. Although there is a movement towards graphical query interfaces, SQL is an extremely important development in the database world. It will be used for years to come either for interactive or programmed access to databases (embedded in other programming languages and tools for application development). SQL3, the latest standard scheduled to appear in 1998, is likely to gain even more importance for SQL, with the introduction of support for knowledge-based and OO applications and distributed databases, among other new features.

Despite the simplicity and highly structured nature of SQL, students have many problems learning it. Some errors in students' queries come from the burden of having to memorize database schemas; incorrect solutions may, for example, contain incorrect table or attribute names. Other errors come from misconceptions in the student's understanding of the elements of SQL and the relational data model in general. Some of the concepts students find particularly difficult to grasp are grouping and restricting grouping. Join conditions and the difference between aggregate and scalar functions are another two common sources of confusion. Other researchers report the same student misconceptions [6].

SQL is usually taught in the classroom, by solving problems on the blackboard, complemented by lab exercises. However, students find that it is not easy to learn SQL directly by working with a DBMS, as the error messages are limited to the syntax only. Figure 1 illustrates a situation in which example 1 requires the student to specify a query with five clauses, as shown in the correct solution. When the student enters his/her incorrect solution, typically the error message generated by a RDBMS (Ingres in this case) will not be of much help. The same figure illustrates the kind of messages* the student may obtain from SQL-Tutor. Note that SQL-Tutor can give feedback on semantic errors as well (such as specifying two tables in FROM where only the MOVIE table is needed).

* Note that the student is given only one message at a time, as governed by the pedagogical rules. Here we show all relevant messages for illustration.
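
The difference between errors a DBMS reports and errors it silently accepts can be reproduced in a few lines. The sketch below is only an illustration under stated assumptions: it uses SQLite through Python's standard `sqlite3` module rather than Ingres, and the DIRECTOR table layout, the sample rows, the misspelled attribute `BORNE` and the helper names `make_db` and `run` are hypothetical.

```python
import sqlite3


def make_db():
    """Create an in-memory DIRECTOR table (hypothetical layout and sample rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE DIRECTOR (NUMBER INTEGER PRIMARY KEY, "
        "LNAME TEXT, FNAME TEXT, BORN INTEGER, DIED INTEGER)"
    )
    conn.executemany(
        "INSERT INTO DIRECTOR VALUES (?, ?, ?, ?, ?)",
        [
            (1, "Hitchcock", "Alfred", 1899, 1980),
            (2, "Allen", "Woody", 1935, None),
        ],
    )
    return conn


def run(conn, sql):
    """Execute a query and return ('ok', rows) or ('error', message)."""
    try:
        return ("ok", conn.execute(sql).fetchall())
    except sqlite3.OperationalError as exc:
        return ("error", str(exc))


if __name__ == "__main__":
    db = make_db()
    # A misspelled attribute name is rejected by the DBMS.
    print(run(db, "SELECT LNAME, FNAME FROM DIRECTOR WHERE BORNE >= 1920"))
    # Using the wrong attribute is accepted and simply yields different rows.
    print(run(db, "SELECT LNAME, FNAME FROM DIRECTOR WHERE DIED >= 1920"))
    print(run(db, "SELECT LNAME, FNAME FROM DIRECTOR WHERE BORN >= 1920"))
```

Only the first query produces an error message; the second runs without complaint even though it answers a different question, which is the kind of mistake a DBMS alone cannot point out.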
Example 1: For each director, list the director's number and the total number of awards won by comedies he/she directed if that number is greater than 1.

Correct solution:

SELECT DIRECTOR, SUM(AAWON)
FROM MOVIE
WHERE TYPE='comedy'
GROUP BY DIRECTOR
HAVING SUM(AAWON) > 1

Student's solution:

SELECT DIRECTOR, SUM(AAWON)
FROM DIRECTOR JOIN MOVIE
ON DIRECTOR=DIRECTOR.NUMBER
WHERE TYPE='comedy'

INGRES: E_US0B63 line 1, The columns in the SELECT clause must be contained in the GROUP BY clause.

SQL-Tutor:
- You need to specify the GROUP BY clause! The problem requires summary information.
- Specify the HAVING clause as well! Not all groups produced by the GROUP BY clause are relevant for this problem.
- You do not need all the tables you specified!
- If there are aggregate functions in the SELECT clause, and the GROUP BY clause is empty, then SELECT must consist of aggregate functions only.
- For every table that appears in the FROM clause, there must be at least one attribute from that table used in any clause of the query.

Figure 1. Inadequacy of feedback from a RDBMS
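
To see what the five clauses of the correct solution to Example 1 compute, the query can be run against a small table. This is a minimal sketch, not the paper's setup: it assumes SQLite (via Python's `sqlite3`) instead of Ingres, a MOVIE table whose NUMBER and TITLE columns are hypothetical, invented sample rows, and an added ORDER BY so that the output order is deterministic.

```python
import sqlite3

# Correct solution from Example 1, plus ORDER BY (an addition for stable output).
CORRECT_QUERY = """
    SELECT DIRECTOR, SUM(AAWON)
    FROM MOVIE
    WHERE TYPE = 'comedy'
    GROUP BY DIRECTOR
    HAVING SUM(AAWON) > 1
    ORDER BY DIRECTOR
"""


def awards_by_director(rows):
    """Load (number, title, type, aawon, director) rows and run the correct query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE MOVIE (NUMBER INTEGER PRIMARY KEY, TITLE TEXT, "
        "TYPE TEXT, AAWON INTEGER, DIRECTOR INTEGER)"
    )
    conn.executemany("INSERT INTO MOVIE VALUES (?, ?, ?, ?, ?)", rows)
    result = conn.execute(CORRECT_QUERY).fetchall()
    conn.close()
    return result


# Invented sample data: director 10 has two comedies with 3 awards in total,
# director 11 has a drama with 5 awards and a comedy with none,
# director 12 has a single comedy with 1 award.
SAMPLE_ROWS = [
    (1, "A", "comedy", 2, 10),
    (2, "B", "comedy", 1, 10),
    (3, "C", "drama", 5, 11),
    (4, "D", "comedy", 1, 12),
    (5, "E", "comedy", 0, 11),
]

if __name__ == "__main__":
    print(awards_by_director(SAMPLE_ROWS))
```

With this data WHERE removes the drama, GROUP BY forms one group per director, and HAVING keeps only director 10, whose comedies won more than one award in total.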
Example 2: List the names of all directors born in or after 1920.

Correct solution:

SELECT LNAME, FNAME
FROM DIRECTOR
WHERE BORN >= 1920

Student's solution:

SELECT LNAME, FNAME
FROM DIRECTOR
WHERE DIED >= 1920

Ingres:

lname        fname
Hitchcock    Alfred
De Mille     Cecil
Ford         John

SQL-Tutor:
Check that you are comparing the numerical constant to the right attribute in the WHERE clause!

Figure 2. Inability of a RDBMS to deal with semantic errors

Figure 2 illustrates a situation of a semantic error. Instead of using the BORN attribute, the student specified the search condition on the DIED attribute, and the DBMS produced the result shown. The student may not even notice the error made. However, SQL-Tutor provides an appropriate message.

Related Work

There have been several systems that support learning relational query languages. WinRDBI [3] enables students to specify queries in relational algebra, tuple or domain relational calculus, or SQL. Queries can be executed and students can inspect resulting tables. The system also allows students to inspect definitions of tables, create new databases, alter or update existing databases and store definitions of queries.

The esql system [6] supports learning SQL by visualizing the stages in query processing. The student sees esql as a graphical query interface. Once the student has specified a SQL statement, the system provides a step-by-step display of how the resulting table is formed.

Both systems provide more information on SQL than commercial DBMSs and better user interfaces. However, they suffer from the same problem as DBMSs: they cannot provide feedback based on the student's solution, due to the lack of knowledge needed to reason about the semantics of the problem being solved.

SQL-Tutor

It is well known that one-on-one human tutoring is much more effective than traditional classroom instruction [2]. The goal of research in ITS is to build computerized tutors that achieve the effects of learning individually with a
human tutor. ITSs contain domain knowledge, which enables them to select problems to be posed to students, to diagnose the student's solution and/or to solve the problems. Furthermore, such systems also contain knowledge of their students, represented in the form of student models. Pedagogical knowledge is necessary in ITSs in order to generate appropriate pedagogical actions (such as feedback). Finally, these systems also require communication knowledge, in order to communicate effectively with students.

SQL-Tutor is an ITS for SQL programming, implemented in CLOS [5] on SUN workstations. It will soon be ported to PC compatibles. Many dialects of SQL exist, since database vendors do not follow the standards. SQL-Tutor has been tailored to SQL as implemented by Ingres.

SQL-Tutor is designed as a practice environment; we suppose that students have previously been exposed to the concepts of database management in lectures. Therefore, the system is not a substitute for the conventional style of education, but a complement to it. The system currently covers only the SELECT statement of SQL, but the same approach could be used with other SQL statements. This focus on the SELECT statement does not reduce the importance of the system, because queries cause the most misconceptions for students. Moreover, many of the concepts covered by SELECT are directly relevant to other SQL statements and other relational database languages in general.

[Diagram: the student, the interface, the pedagogical module, the CBM student modeler and the student models]
Figure 3. Architecture of SQL-Tutor

As illustrated in figure 3, SQL-Tutor has a very simple architecture; it consists of a user interface, pedagogical module and a student modeler. The interface is illustrated in figure 4. The interface is a mediating device and hence it provides information about the system itself. The main window of SQL-Tutor is divided into three areas that are always visible to the student. The upper part of the window displays the text of the problem being solved and the student can always remind him/herself easily of the elements requested in the query. The middle part contains the clauses of the SQL SELECT statement, thus visualizing the goal structure. Students need not remember the exact keywords used and the relative order of clauses. The lowest part displays the schema of the currently chosen database. The schema name is given first, followed by the descriptions of tables. Each table is shown by its name and schema enclosed in a box. The name(s) of the attribute(s) forming the primary key is underlined and given in blue. The foreign key attributes are given in red. In such ways, the interface of SQL-Tutor supports the reification of goal structure and reduces the working-memory load of students.

The visualization of schemas is quite important; all database users are painfully aware of the constant need to remember table and attribute names and the corresponding semantics as well. Students can ask for the description of databases, tables or attributes by selecting appropriate options from the Help menu, or by directly selecting table/attribute names. Furthermore, users can learn about elements of SQL, such as functions, expressions, predicates, operators and others, by selecting appropriate options in the Help menu. The motivation here is to remove from the student some of the cognitive load required for checking the low-level syntax and to enable the student to focus on higher-level query definition problems. Students can also obtain the descriptions of various clauses by selecting the appropriate clause or by asking for help from the main menu. The Open menu allows for selection of a database or a problem to work on.

The pedagogical module (PM) is the heart of the system; it selects problems to be given to students and generates appropriate instructional actions according to the student model. PM observes every student's action performed in the interface, and reacts to it appropriately. At the beginning of the interaction, a problem must be selected for the student to work on. When the student enters the solution for the current problem, PM sends it to the student modeler, which checks whether the solution is correct or incorrect, and updates the student model. The pedagogical module then generates appropriate feedback. When the current problem is solved, or the student requires a new problem to work on, the pedagogical module selects a new problem on the basis of the student model. The system contains definitions of several databases, which are also implemented on the RDBMS used in the lab (currently Ingres). New databases can easily be added, by supplying the same SQL files used to create the database in Ingres.

SQL-Tutor also contains a set of problems for specified databases and the ideal solutions to them. The solutions are necessary because the system has no domain module and is not capable of solving problems. The rationale for such a departure from the typical architecture of an ITS, which also includes a domain module, follows. Designing an ITS to teach SQL presents various difficulties. Database queries
are given in a natural language; however, the current state-of-the-art in Natural Language Processing (NLP) is still far from being capable of handling various problems present in queries, such as references and synonyms. There is a possibility to circumscribe the NLP problem: the text of the problem may be represented not in its natural-language form, but in a form which could be the product of NLP, as done in [1]. However, it is hard not to build parts of the solution into such a representation. Furthermore, even if we overlook the NLP problem, the knowledge required to write SQL queries is very fuzzy. Therefore, it would be very difficult, if not entirely impossible, to develop a problem solver in this area.

Figure 4. The Interface of SQL-Tutor

SQL-Tutor is based on Constraint-Based Modeling (CBM) [8], a student modeling approach that focuses on student errors. For further details of CBM and how it is implemented in SQL-Tutor, see [7]. Domain knowledge is represented in CBM in a descriptive form, as constraints, and is used to identify the errors. Constraints divide all possible problem states into equivalence classes. All states in a single class are deemed to be pedagogically equivalent in that they generate the same instructional action.

SQL-Tutor evaluates students' solutions by matching them to constraints. Some constraints deal with the syntax of the language; for example, there is a constraint saying that the SELECT clauses of all solutions must not be empty. Another example of a syntactic constraint checks that if a student's solution contains aggregate functions in the SELECT clause and the GROUP BY clause is empty, then the only kind of expressions allowed in the SELECT clause are aggregate functions. Other constraints deal with the semantics of problems, by comparing students' solutions to the ideal (correct) ones. That is the reason for SQL-Tutor to require ideal solutions to problems.

Constraints that compare the student's solution to the ideal one are more complex. For example, constraint 186 applies to situations where the WHERE clause of the ideal solution contains (at least one) condition which checks whether the value of a numeric attribute is greater than some numeric constant and the same attribute appears in the student's solution in a condition with the greater-than-or-equal instead of the greater-than operator. If that is the case, the constraint ensures that the constant in the student's solution is 1 greater than the constant in the ideal solution (for integer values, A > c is equivalent to A >= c + 1).

The constraint base of SQL-Tutor currently consists of 199 constraints, which are acquired by analyzing the domain knowledge [4,9] and on the basis of a comparative analysis of correct and incorrect solutions. It is well known that knowledge acquisition is a very slow, time-consuming and labour-intensive process. Anderson [1] reports 10 or more hours necessary for induction of a production rule. When interviewing domain experts in order to acquire knowledge for expert systems, usually 2 to 5 production rule equivalents are identified per day. The time spent on identification, implementation and testing of SQL-Tutor constraints averages 1.3 hours per constraint, which is significantly shorter than the times above. This may be the consequence of the same person serving as the domain expert and knowledge engineer (and the system developer, for that matter), but may also illustrate the appropriateness of the chosen formalism.
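
The way constraints are matched against a solution can be sketched as follows. This is a hypothetical illustration, not SQL-Tutor's implementation (which is written in CLOS): it follows the usual CBM formulation [8] of a constraint as a pair of a relevance condition and a satisfaction condition, represents a solution as a dictionary of clause contents, and uses made-up constraint identifiers 1 to 3. The aggregate-function test is a simple regular expression assumed for this example.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List

# A solution is modelled (as an assumption) as clause name -> list of items.
Solution = Dict[str, List[str]]

AGGREGATE = re.compile(r"^\s*(SUM|AVG|COUNT|MIN|MAX)\s*\(", re.IGNORECASE)


def is_aggregate(expr: str) -> bool:
    """True if the expression starts with an aggregate function call."""
    return bool(AGGREGATE.match(expr))


@dataclass
class Constraint:
    ident: int                                        # hypothetical identifier
    relevance: Callable[[Solution, Solution], bool]   # does the constraint apply?
    satisfaction: Callable[[Solution, Solution], bool]  # if so, does it hold?
    message: str                                      # feedback when violated


CONSTRAINTS = [
    # Syntactic: the SELECT clause must not be empty.
    Constraint(1, lambda s, i: True,
               lambda s, i: len(s["select"]) > 0,
               "The SELECT clause must not be empty."),
    # Syntactic: aggregates without GROUP BY allow only aggregates in SELECT.
    Constraint(2,
               lambda s, i: any(is_aggregate(e) for e in s["select"])
               and not s["group_by"],
               lambda s, i: all(is_aggregate(e) for e in s["select"]),
               "If there are aggregate functions in SELECT and GROUP BY is "
               "empty, SELECT must consist of aggregate functions only."),
    # Semantic: compares the student's solution (s) with the ideal one (i).
    Constraint(3, lambda s, i: bool(i["group_by"]),
               lambda s, i: bool(s["group_by"]),
               "You need to specify the GROUP BY clause!"),
]


def violated(student: Solution, ideal: Solution) -> List[int]:
    """Return identifiers of constraints that are relevant but not satisfied."""
    return [c.ident for c in CONSTRAINTS
            if c.relevance(student, ideal) and not c.satisfaction(student, ideal)]


if __name__ == "__main__":
    ideal = {"select": ["DIRECTOR", "SUM(AAWON)"], "group_by": ["DIRECTOR"]}
    student = {"select": ["DIRECTOR", "SUM(AAWON)"], "group_by": []}
    print(violated(student, ideal))
```

A constraint that is not relevant to a solution is never reported, so the ideal solution itself violates nothing, while the student's solution from Example 1 violates both the aggregate constraint and the GROUP BY comparison.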

A student model contains general information about the student (his/her name and the level of knowledge), a history of previously solved problems, and information about the usage of constraints, as manifested in the student's solutions.

Learning in SQL-Tutor

The main goal of ITSs is the individualization of instruction. In SQL-Tutor, instruction can be individualized in several ways, by generating feedback dynamically and selecting topics and problems, on the basis of the student model.

The level of feedback determines how much information is provided to the student. Currently, there are five levels of feedback in the system: positive/negative feedback, error flag, hint, partial solution and complete solution, arranged in the increasing order of the amount of information. At the lowest level (positive/negative feedback), the message simply informs the student whether the solution is correct or not and, in the latter case, how many errors there are. An error flag message informs the student about the clause* in which the error occurred. A hint-type message gives more information about the type of error, as illustrated in figure 4. Here, the student is given a general description of the cause of the error. Partial solution feedback displays the correct content of the clause in question, while the complete solution simply displays the correct solution of the current problem.

Problems are also selected on the basis of a student model. SQL-Tutor examines the student model and selects a problem for a constraint that the student has violated before, or a problem that requires the use of a constraint not used by the student. The system also allows the student to select the problem on his/her own. Such an approach introduces randomness in the coverage of constraints, which can mean that the student is practising the use of some known constraint or even introducing new ones. The randomness thus provides for challenge and/or review, and at the same time helps control for potential inaccuracies in the student model. Admittedly, the problem selection strategies just discussed are too simple and we are currently developing more sophisticated ones.

SQL-Tutor is based on guided discovery and learning-by-doing. It supports three kinds of learning: conceptual, problem solving and meta-learning. The student can learn about concepts and elements of SQL by asking for explanations, using menu options and interface controls. SQL-Tutor is a problem-solving environment that supports acquisition of domain knowledge in a declarative form (i.e. constraints) and strengthening of this knowledge in practice. SQL-Tutor provides assistance in problem solving and arguments against incorrect actions. Finally, the system encourages meta-learning by supporting self-explanation on the basis of error messages and correct solutions. We plan to elaborate the pedagogical actions that will provide more emphasis on self-explanation, and also to incorporate other forms of meta-learning, such as using analogies.

* In case that there are several messages for various clauses, the pedagogical module will select one of them to start with.

Conclusions

This paper presented the current state in the implementation of SQL-Tutor. The system has been shown to a number of database teachers, who were very supportive and expressed great enthusiasm for using it in their own courses. We plan the system to be ready for classroom use in early 1998.

Before the system can be evaluated, there are several short-term goals, such as further sophistication of the interface and completion of the constraint base. In order to provide a more realistic working environment, we plan to connect SQL-Tutor to a DBMS. In such a way, the student may inspect tables or query results.

We believe that SQL-Tutor will prove to be invaluable due to the semantically rich feedback it generates and its ability to adapt to a particular student. There are many possibilities for extending this research. More research is needed on pedagogical rules and problem-selecting strategies. Related areas in the database arena, such as relational algebra and calculus, data modeling or normalization, could serve as domains for other small instructional tools and be connected with SQL-Tutor into a database exploration "world".

References

1. Anderson, J.R., Corbett, A.T., Koedinger, K.R. and Pelletier, R. Cognitive Tutors: Lessons Learned. The Journal of the Learning Sciences 4, (1995) 167-207.
2. Bloom, B. The 2 Sigma Problem: The Search for Methods of Group Instruction as Effective as One-to-one Tutoring. Educational Researcher 13, (1984) 3-16.
3. Dietrich, S. WinRDBI: a Windows-Based Relational Database Educational Tool. In SIGCSE'97, 126-130.
4. Elmasri, R. and Navathe, S.B. Fundamentals of database systems (2nd ed.). Benjamin/Cummings, Redwood, CA, 1994.
5. Franz Inc. Allegro Common Lisp, 1996.
6. Kearns, R., Shead, S. and Fekete, A. A Teaching System for SQL. In Australasian Computer Science Education ACSE'97, ACM Press, (1997) 224-231.
7. Mitrovic, A. SQL-Tutor: a Preliminary report. Tech. Rep., Computer Science Dept., Univ. of Canterbury, TR-COSC 08/97, 1997.
8. Ohlsson, S. Constraint-Based Student Modeling. In J.E. Greer and G.I. McCalla (eds.): Student Modeling: the Key to Individualized Knowledge-Based Instruction. Springer-Verlag, Berlin (1994) 167-189.
9. Pratt, P.J. A Guide to SQL, Boyd & Fraser, Boston, 1990.
