
Databases

course book
Version 4.1 (24 September 2014)
Free University of Bolzano Bozen Paolo Coletti

Introduction
This book contains the lessons of the relational databases and Access courses held at the Free University of Bolzano Bozen. The book is divided into levels; the level is indicated in parentheses after each section's title:
- students of the Information Systems and Data Management 27000 course use level 1;
- students of the Information Systems and Data Management 27006 course use levels 1, 2 and 3;
- students of the Computer Science and Information Processing course use levels 1, 2 and 3;
- students of the Advanced Data Analysis course use levels 2 and 5.
This book refers to Microsoft Access 2010, with references to 2007 and 2003 in footnotes, to MySQL Community Server version 5.5 and to HeidiSQL version 7.0.0.
This book is in continuous development; please take a look at its version number, which marks important changes.

Disclaimers
This book is designed for novice database designers. It contains simplifications of theory, and many technical details are purposely omitted.

Table of Contents
Introduction
Table of Contents
1. Relational databases (level 2)
    1.1. Database in normal form
    1.2. Relations
    1.3. One-to-many relation
    1.4. One-to-one relation
    1.5. Many-to-many relation
    1.6. Foreign key with several relations
    1.7. Referential integrity
    1.8. Temporal versus static database
    1.9. Non-relational structures
    1.10. Entity-relationship model (level 9)
2. Microsoft Access (level 1)
    2.1. Basic operations
    2.2. Tables (level 1)
    2.3. Forms (level 3)
    2.4. Queries (level 1)
    2.5. Reports (level 3)
3. MySQL (level 5)
    3.1. HeidiSQL
    3.2. Installing MySQL server
4. SQL language for MySQL (level 5)
    4.1. Basic operations
    4.2. Simple selection queries
    4.3. Inner joins
    4.4. Summary queries
    4.5. Modifying records
    4.6. External data
    4.7. Tables
5. Designing a database (level 2)
    5.1. Paper diagram
    5.2. Building the tables
    5.3. Inserting data
6. Technical documentation (level 9)
    6.1. My farm example

1. Relational databases (level 2)


This chapter presents the basic ideas and motivations which lie behind the concept of relational database. Readers with previous experience in building schemas for relational databases can skip this part.
A relational database is defined as a collection of tables connected via relations. It is always a good idea to have these tables organized in a structured way that is called normal form.

1.1. Database in Normal Form


The simplest form of database, which can be handled even by Microsoft Excel, is a single table. To be a database in normal form, the table must satisfy some requirements:
1. the first line contains the headers of the columns, which unambiguously define the content of the column. For example:
   Student number   Name   Surname   Telephone
   2345             Mary   Smith     0471234567

2. each column contains only what is indicated in its header. For example, in a column with header telephone number we may not put two numbers or an indication of the preferred calling time, such as in the second row of this table:
   Student number   Name   Surname    Telephone
   2345             Mary   Smith      0471234567
   2348             John   McFlurry   0471234567 or 3378765432

3. each row refers to a single object. For example, there may not be a row with information on several objects or on a group of objects, such as in the second row of this table:
   Student number    Name   Surname   Degree course
   2345              Mary   Smith     Economics and Management
   Starting with 5                    Logistics and Production Engineering


4. rows are independent, i.e. no cell has references to other rows, such as in the second row of this table:
   Student number   Name   Surname   Notes
   2345             Mary   Smith
   2376             John   Smith     is the brother of 2345

5. rows and columns are unordered, i.e. their order is not important. For example, these four tables are the same one:
   Student number   Name   Surname
   2345             Mary   Smith
   2376             John   McFlurry

   Student number   Name   Surname
   2376             John   McFlurry
   2345             Mary   Smith

   Name   Student number   Surname
   Mary   2345             Smith
   John   2376             McFlurry

   Surname    Student number   Name
   McFlurry   2376             John
   Smith      2345             Mary


6. cells do not contain values which can be directly calculated from cells of the same row, such as in the last column of this table:
   Student number   Name   Surname    Tax 1st semester   Tax 2nd semester   Total tax
   2345             Mary   Smith      550                430                980
   2376             John   McFlurry   450                0                  450
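Requirement 6 can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for the database programs discussed in this book; the column names Tax1 and Tax2 are assumptions introduced for the example. The total is calculated by a query when needed instead of being stored in the table.

```python
import sqlite3

# In-memory SQLite database, used here only as an illustrative stand-in.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Students ("
    " StudentNumber INTEGER PRIMARY KEY,"
    " Name TEXT, Surname TEXT,"
    " Tax1 INTEGER,"   # tax for the 1st semester (assumed column name)
    " Tax2 INTEGER)"   # tax for the 2nd semester (assumed column name)
)
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?, ?, ?)",
    [(2345, "Mary", "Smith", 550, 430), (2376, "John", "McFlurry", 450, 0)],
)

# The total tax is computed on request and never stored in a field.
totals = conn.execute(
    "SELECT StudentNumber, Tax1 + Tax2 AS TotalTax "
    "FROM Students ORDER BY StudentNumber"
).fetchall()
print(totals)
```

Since the total is derived from the two semester fields, it can never disagree with them after an update.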

Database rows are called records and database columns are called fields.
Single-table databases can be easily handled by many programs and by human beings, even when the table is very long or has many fields. There are however situations in which a single table is not an efficient way to handle the information.

1.1.1. Primary key


Each table should have a primary key, which means a field whose value is different for every record. Often the primary key has a natural candidate, for example the student number for a students table, the tax code for a citizens table, the telephone number for a telephones table. Other times a good primary key candidate is difficult to find: for example, in a cars table the car name is not a primary key, since there are different series and different engine types of the same car. In these cases it is possible to add an extra field, called ID or surrogate key, with a progressive number, to be used as primary key. In many database programs this progressive number is handled directly by the program itself.
It is also possible to define several fields together as primary key: for example, in a people table the first name together with the last name, place and date of birth form a unique sequence for every person. In this case the primary key is also called composite key or compound key. On some database management programs, however, handling a composite key can create problems, and therefore it is a better idea to use an ID in this case.
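The three kinds of primary key described above can be sketched as follows. This is only an illustration written with Python's sqlite3 module, not the Access procedure; the table layouts and the car names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Natural primary key: the student number is different for every record.
conn.execute(
    "CREATE TABLE Students (StudentNumber INTEGER PRIMARY KEY, Name TEXT, Surname TEXT)"
)

# Surrogate key: the progressive ID is generated by the database program.
conn.execute(
    "CREATE TABLE Cars (ID INTEGER PRIMARY KEY AUTOINCREMENT, CarName TEXT, Series TEXT)"
)
conn.execute("INSERT INTO Cars (CarName, Series) VALUES ('CarX', 'first series')")
conn.execute("INSERT INTO Cars (CarName, Series) VALUES ('CarX', 'second series')")
car_ids = [row[0] for row in conn.execute("SELECT ID FROM Cars ORDER BY ID")]

# Composite key: several fields together identify a person.
conn.execute(
    "CREATE TABLE People ("
    " FirstName TEXT, LastName TEXT, BirthPlace TEXT, BirthDate TEXT,"
    " PRIMARY KEY (FirstName, LastName, BirthPlace, BirthDate))"
)

# A second record with an already used primary key value is refused.
conn.execute("INSERT INTO Students VALUES (2345, 'Mary', 'Smith')")
try:
    conn.execute("INSERT INTO Students VALUES (2345, 'John', 'McFlurry')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

The two cars share the same name but receive different IDs, which is exactly the purpose of a surrogate key.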

1.2. Relations
1.2.1. Information redundancy
In some situations, trying to put the information we need in a single-table database causes a duplication of identical data, which can be called information redundancy. For example, if we add to our students table the information on who is the reference secretary for each student, together with other secretary's information such as office telephone number, office room and timetables, we get this table:
Student number   Name    Surname    Secretary     Telephone    Office   Time
2345             Mary    Smith      Anne Boyce    0471222222   C340     14-18
2376             John    McFlurry   Jessy Codd    0471223334   C343     9-11
2382             Elena   Burger     Jessy Codd    0471223334   C343     9-11
2391             Sarah   Crusa      Anne Boyce    0471222222   C340     14-18
2393             Bob     Fochs      Jessy Codd    0471223334   C343     9-11

Information redundancy is not a problem by itself, but:
- storing the same information several times is a waste of computer space (hard disk and memory), which, for a very large table, has a bad impact on the size of the file and on the speed of every search or sorting operation;
- whenever we need to update repeated information (e.g. the secretary changes office), we need to make a lot of changes;
- manually inserting the same information several times can lead to typing (or copy-and-paste) mistakes, which decrease the quality of the database.


In order to avoid this situation, it is a common procedure to split the table into two distinct tables, one for the students and another one for the secretaries. To each secretary we assign a unique code and for each student we indicate the secretary's code.
Students
Student number   Name    Surname    Secretary
2345             Mary    Smith      1
2376             John    McFlurry   2
2382             Elena   Burger     2
2391             Sarah   Crusa      1
2393             Bob     Fochs      2

Secretaries
Secretary code   Name    Surname   Telephone    Office   Time
1                Anne    Boyce     0471222222   C340     14-18
2                Jessy   Codd      0471223334   C343     9-11

In this way the information on each secretary is written and stored only once and can be updated very easily. The price for this is that every time we need to know who a student's secretary is, we have to look at the student's secretary code and find the corresponding code in the Secretaries table: this can be a long and frustrating procedure for a human being when the Secretaries table has many records, but it is a very fast task for a computer program which is designed to quickly search through tables.
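The lookup that is tedious for a human being is a single query for a program. The following sketch, written with Python's sqlite3 module as a stand-in for the programs used in this book, reproduces the two tables (without the Time column) and shows both the lookup and the single update needed when a secretary changes office. The function name secretary_of and the new office C350 are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Secretaries (SecretaryCode INTEGER PRIMARY KEY,"
    " Name TEXT, Surname TEXT, Telephone TEXT, Office TEXT)"
)
conn.execute(
    "CREATE TABLE Students (StudentNumber INTEGER PRIMARY KEY,"
    " Name TEXT, Surname TEXT, Secretary INTEGER)"
)
conn.executemany(
    "INSERT INTO Secretaries VALUES (?, ?, ?, ?, ?)",
    [(1, "Anne", "Boyce", "0471222222", "C340"),
     (2, "Jessy", "Codd", "0471223334", "C343")],
)
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?, ?)",
    [(2345, "Mary", "Smith", 1), (2376, "John", "McFlurry", 2),
     (2382, "Elena", "Burger", 2), (2391, "Sarah", "Crusa", 1),
     (2393, "Bob", "Fochs", 2)],
)

def secretary_of(student_number):
    """Follow the student's secretary code into the Secretaries table."""
    return conn.execute(
        "SELECT Secretaries.Name, Secretaries.Surname, Secretaries.Office "
        "FROM Students JOIN Secretaries "
        "ON Students.Secretary = Secretaries.SecretaryCode "
        "WHERE Students.StudentNumber = ?",
        (student_number,),
    ).fetchone()

# The secretary changes office: one record is updated, not one per student.
conn.execute("UPDATE Secretaries SET Office = 'C350' WHERE SecretaryCode = 2")
print(secretary_of(2345), secretary_of(2376))
```

After the update, every student of secretary 2 automatically sees the new office, because the information is stored only once.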

1.2.2. Empty fields


Another typical problem which arises with single-table databases is the case of many empty fields. For example, if we want to build an address book with the telephone numbers of all the people, we will have somebody with no telephone numbers, many people with a few telephone numbers, and some people with a lot of telephone numbers. Moreover, we must also take into consideration that new numbers will probably be added in the future for anybody.
If we reserve a field for every telephone, the table looks like this:
Name Surname Phone1 Phone2 Phone3 Phone4 Phone5 Phone6 Phone7
Mary Smith 0412345
John McFlurry 0412375 3396754
Elena Burger 0412976 3397654 0436754 3376547 0487652 3387655 0463456
Sarah Crusa 0418765 0412345
Bob Fochs 0346789 0765439 3376543

As is clear, if we reserve several fields for the telephone numbers, a lot of cells are empty. The problems of empty cells are:
- an empty cell is a waste of computer space;
- there is a fixed limit of fields which may be used. If a record needs another field (for example, Elena Burger gets another telephone number) the entire structure of the table must be changed;
- since all these fields contain the same type of information, it is difficult to search whether a piece of information is present, since it must be looked for in every field, including the cells which are empty.
In order to avoid this situation, we again split the table into two distinct tables, one for the people and another one for their telephone numbers. This time, however, we assign a unique code to each person and we build the second table with person-telephone combinations.


People
Person code   Name    Surname
1             Mary    Smith
2             John    McFlurry
3             Elena   Burger
4             Sarah   Crusa
5             Bob     Fochs

Telephones
Owner   Number
1       0412345
2       0412375
2       3396754
3       0412976
3       3397654
3       0436754
3       3376547
3       0487652
3       3387655
3       0463456
4       0418765
4       0412345
5       0346789
5       0765439
5       3376543

Even though it seems strange, each person's code appears several times in the Telephones table. This is correct, since the Telephones table uses the exact amount of records needed and thus avoids empty cells: people appear as many times as the telephones they have, and people with no telephone do not appear at all. The drawback is that every time we want to know a person's telephone numbers we have to go through the entire Telephones table searching for the person's code, but again this procedure is very fast for an appropriate computer program.
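A minimal sketch of the People and Telephones tables, again using Python's sqlite3 module as a stand-in: finding all the numbers of a person is one query, and a new number is simply one more record, with no change to the structure of the table. The function name numbers_of and the added number 0471999999 are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE People (PersonCode INTEGER PRIMARY KEY, Name TEXT, Surname TEXT)")
# Number is stored as text because telephone numbers may start with 0.
conn.execute("CREATE TABLE Telephones (Owner INTEGER, Number TEXT)")
conn.executemany(
    "INSERT INTO People VALUES (?, ?, ?)",
    [(1, "Mary", "Smith"), (2, "John", "McFlurry"), (3, "Elena", "Burger"),
     (4, "Sarah", "Crusa"), (5, "Bob", "Fochs")],
)
conn.executemany(
    "INSERT INTO Telephones VALUES (?, ?)",
    [(1, "0412345"), (2, "0412375"), (2, "3396754"), (3, "0412976"),
     (3, "3397654"), (3, "0436754"), (3, "3376547"), (3, "0487652"),
     (3, "3387655"), (3, "0463456"), (4, "0418765"), (4, "0412345"),
     (5, "0346789"), (5, "0765439"), (5, "3376543")],
)

def numbers_of(person_code):
    """Search the Telephones table for all the records of one owner."""
    rows = conn.execute(
        "SELECT Number FROM Telephones WHERE Owner = ? ORDER BY Number",
        (person_code,),
    )
    return [row[0] for row in rows]

# Elena Burger gets another (hypothetical) number: one new record is enough.
conn.execute("INSERT INTO Telephones VALUES (3, '0471999999')")
print(numbers_of(2), len(numbers_of(3)))
```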

1.2.3. Foreign key


When a field, which is not the primary key, is used in a relation with another table, this field is called foreign key. This field is important for the database management program, such as Access, when it has to check referential integrity (see section 1.7).
For example, in the previous examples Owner is a foreign key for the Telephones table and Secretary is a foreign key for the Students table.

1.3. One-to-many relation


A relation is a connection between a field of table A (which becomes a foreign key) and the primary key of table B: on the B side the relation is "1", meaning that for each record of table A there is one and only one corresponding record of table B, while on the A side the relation is "many" (indicated with the mathematical symbol ∞), meaning that for each record of table B there can be none, one or more corresponding records in table A.
For the example of section 1.2.1, the tables are indicated in this way, meaning that for each student there is exactly one secretary and for each secretary there are many students. This relation is called many-to-one relation.


Students: Student number, Name, Surname, Secretary
Secretaries: ID, Name, Surname, Telephone, Office, Time
Relation: Students.Secretary (∞) to Secretaries.ID (1)

For the example of section 1.2.2, the tables are instead indicated in this way, meaning that for each person there can be none, one or several telephone numbers and for each number there is only one corresponding owner. This relation is called one-to-many relation.

People: Person code, Name, Surname
Telephones: Owner, Number
Relation: People.Person code (1) to Telephones.Owner (∞)

Clearly one-to-many and many-to-one are the same relation, the only difference being the order in which the tables are drawn.
It is however very important to correctly identify the "1" side, since it has several implications for the correct working of the database. For example, in the previous example putting the "1" side on the Telephones table means that for each person there is only one telephone and that for each telephone there are many people, a situation which was possible up to the 90s, when there was only one telephone for a whole family, used by all its members, but which is not what we want to describe with the current 21st century's database. Moreover, reversing the relation also requires changing the structure of the tables a little, putting the foreign key Telephone in the People table instead of the foreign key Owner in the Telephones table, such as

People: Person code, Name, Surname, Telephone
Telephones: Number
Relation: People.Telephone (∞) to Telephones.Number (1)

1.4. One-to-one relation


A one-to-one relation is a direct connection between two primary keys. Each record of the first table has exactly one corresponding record in the second table and vice versa. An example can be countries and national flags. This relation can sometimes be useful to separate into two tables two conceptually different objects with a lot of fields, but it should be avoided, since the two tables can be easily joined together in a single table.


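With a one-to-one relation:
Countries: Name, Size, Population, Continent
Flags: ID, Shape, Picture
Relation: Countries.Name (1) to Flags.ID (1)

Joined together in a single table:
Countries: Name, Size, Population, Continent, Flag shape, Flag picture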

1.5. Many-to-many relation


Even though many-to-many relations are very common in real applications, unfortunately they cannot be handled automatically by relational databases. In order to deal with them, relational databases use a junction table, which is an extra table with the task of connecting together the two fields which are many-to-many related; sometimes this junction table has a corresponding meaning in everyday experience, other times it is only an abstract representation of the relation. In any case, it is always a good idea to give a meaningful name to the junction table, often using a question form such as "What is owned by whom", to always have its meaning clearly in mind.
For example, we build a database with houses and owners. Each house may be owned by several people (with different percentages or, if we are building a historical database, with different starting and ending dates), and on the other hand each person may own several portions of houses. In order to represent this many-to-many relation between houses and owners we use a junction table which can be called either "What is owned by whom" or "Who owns what" or, using a more tangible name, PropertyActs.

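Houses: Address, Square meters, Height, Construction year
PropertyActs (junction table): Act number, Percentage, House, Owner, Begin date, End date
Owners: Tax code, Name, Surname, Birth place, Birth date
Relations:
PropertyActs.House (∞) to Houses.Address (1)
PropertyActs.Owner (∞) to Owners.Tax code (1)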


Each owner can therefore have many property acts and each house can have many property acts which refer to that house. On the other hand each property act has written on it only one owner and one house.
This is the typical structure of the junction table: it contains two or more foreign keys on the "many" side of the relation. An example where the junction table contains four foreign keys is this database of car competitions with CarTypes, Tires, Races, Drivers.


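Drivers: Tax code, Name, Surname, Address
CarTypes: Car type, Brand, Engine cc, Speed
Tires: Tire name, Radial, Type, Width
Races: Race name, Date, Length
Participants (junction table): Car plate, Car type, Driver, Tires, Race, Race time, Arrival position
Relations:
Participants.Driver (∞) to Drivers.Tax code (1)
Participants.Car type (∞) to CarTypes.Car type (1)
Participants.Tires (∞) to Tires.Tire name (1)
Participants.Race (∞) to Races.Race name (1)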
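Going back to the simpler houses and owners example, a junction table with two foreign keys can be sketched as follows with Python's sqlite3 module. The addresses, tax codes and percentages are invented for the illustration, and the function names owners_of and houses_of are assumptions. The same junction table answers both questions: who owns a house, and what a person owns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Houses (Address TEXT PRIMARY KEY, SquareMeters INTEGER)")
conn.execute("CREATE TABLE Owners (TaxCode TEXT PRIMARY KEY, Name TEXT, Surname TEXT)")
# Junction table: each act carries exactly one house and one owner.
conn.execute(
    "CREATE TABLE PropertyActs (ActNumber INTEGER PRIMARY KEY,"
    " Percentage INTEGER, House TEXT, Owner TEXT)"
)
conn.executemany("INSERT INTO Houses VALUES (?, ?)",
                 [("Via Roma 1", 120), ("Via Dante 5", 80)])
conn.executemany("INSERT INTO Owners VALUES (?, ?, ?)",
                 [("AAA", "Mary", "Smith"), ("BBB", "John", "McFlurry")])
conn.executemany(
    "INSERT INTO PropertyActs VALUES (?, ?, ?, ?)",
    [(1, 50, "Via Roma 1", "AAA"),
     (2, 50, "Via Roma 1", "BBB"),
     (3, 100, "Via Dante 5", "AAA")],
)

def owners_of(address):
    """What is owned by whom: all the owners of one house."""
    return conn.execute(
        "SELECT Owners.Name, Owners.Surname, PropertyActs.Percentage "
        "FROM PropertyActs JOIN Owners ON PropertyActs.Owner = Owners.TaxCode "
        "WHERE PropertyActs.House = ? ORDER BY PropertyActs.ActNumber",
        (address,),
    ).fetchall()

def houses_of(tax_code):
    """Who owns what: all the houses of one owner."""
    rows = conn.execute(
        "SELECT House FROM PropertyActs WHERE Owner = ? ORDER BY House",
        (tax_code,),
    )
    return [row[0] for row in rows]

print(owners_of("Via Roma 1"), houses_of("AAA"))
```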

1.5.1. Details table


Many times in everyday applications the relation is so complicated that a junction table is not enough. This is the case, for example, of a selling database, with table Customers and table Products. Clearly each customer may order different products and each product is hopefully ordered by several customers, therefore we need an Orders junction table. This table also contains all the details of the order, such as the amount of products, the date and the shipping cost.

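Customers: CustomerID, Name, Surname, Address
Orders (junction table): Order number, Date, Customer, Product, Shipping cost, Amount
Products: Product code, Description, UnitPrice, Category, Weight
Relations:
Orders.Customer (∞) to Customers.CustomerID (1)
Orders.Product (∞) to Products.Product code (1)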


However, while it is correct that for each order there is one and only one customer, in this schema for each order there is also one and only one product, which is not what usually happens in real applications, where a customer orders several products at the same time and also wants to pay for them all together with combined shipping costs.
In order to deal with this situation, we need a details table. We leave all the order's administrative information, including the customer relation, in the Orders table and we move the list of ordered products into the details table, which will look like the Telephones table of section 1.2.2.

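Customers: CustomerID, Name, Surname, Address
Orders: OrderID, Date, CustomerID, Shipping cost
OrderDetails: ID, OrderID, ProductID, Quantity
Products: ProductID, Description, UnitPrice, Category, Weight
Relations:
Orders.CustomerID (∞) to Customers.CustomerID (1)
OrderDetails.OrderID (∞) to Orders.OrderID (1)
OrderDetails.ProductID (∞) to Products.ProductID (1)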


Each record in the OrderDetails table represents a product which is ordered, with its amount, and clearly an order can have several details. In this way an entire order can be represented by taking from the Customers table the information on who ordered it, from the Products table, through the OrderDetails table, the information on the products, and from the Orders table itself the administrative information.
Using queries and reports (explained in sections 2.4 and 2.5 for Access) all these data can be conveniently put together, taking them from the tables and automatically joining them following the relations, into a report like this one.

[Figure: layout of the order report]
Orders: OrderID, Date
Customers: Name, Surname, Address
OrderDetails, one line for each ordered product:
    Products: Product, Description, Weight; OrderDetails: Amount
Orders: Shipping cost

A details table is in general used every time the junction table, even with several foreign keys, is not enough to describe the relation. In some cases further subdetail tables may even be necessary.
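The way an order is reassembled from the Orders, OrderDetails and Products tables can be sketched as follows, using Python's sqlite3 module as a stand-in for the queries described later in this book. The products, prices and quantities are invented, prices are kept as whole numbers to keep the arithmetic simple, and the function names order_lines and order_total are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, Name TEXT, Surname TEXT)")
conn.execute("CREATE TABLE Products (ProductID INTEGER PRIMARY KEY, Description TEXT, UnitPrice INTEGER)")
conn.execute("CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, Date TEXT,"
             " CustomerID INTEGER, ShippingCost INTEGER)")
conn.execute("CREATE TABLE OrderDetails (ID INTEGER PRIMARY KEY, OrderID INTEGER,"
             " ProductID INTEGER, Quantity INTEGER)")

conn.execute("INSERT INTO Customers VALUES (1, 'Mary', 'Smith')")
conn.executemany("INSERT INTO Products VALUES (?, ?, ?)",
                 [(1, "Pen", 2), (2, "Notebook", 5)])
# One order with its administrative information...
conn.execute("INSERT INTO Orders VALUES (1, '2014-09-24', 1, 4)")
# ...and one details record for each ordered product.
conn.executemany("INSERT INTO OrderDetails VALUES (?, ?, ?, ?)",
                 [(1, 1, 1, 3), (2, 1, 2, 2)])

def order_lines(order_id):
    """List the products of an order through the OrderDetails table."""
    return conn.execute(
        "SELECT Products.Description, OrderDetails.Quantity "
        "FROM OrderDetails JOIN Products "
        "ON OrderDetails.ProductID = Products.ProductID "
        "WHERE OrderDetails.OrderID = ? ORDER BY OrderDetails.ID",
        (order_id,),
    ).fetchall()

def order_total(order_id):
    """Goods of all the details plus the combined shipping cost of the order."""
    return conn.execute(
        "SELECT Orders.ShippingCost + SUM(OrderDetails.Quantity * Products.UnitPrice) "
        "FROM Orders "
        "JOIN OrderDetails ON OrderDetails.OrderID = Orders.OrderID "
        "JOIN Products ON OrderDetails.ProductID = Products.ProductID "
        "WHERE Orders.OrderID = ? GROUP BY Orders.OrderID",
        (order_id,),
    ).fetchone()[0]

print(order_lines(1), order_total(1))
```

The shipping cost is stored once in the Orders table, so it is added a single time to the total regardless of how many details the order has.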

1.6. Foreign key with several relations


Consider a database with people and companies. Clearly these two objects must be in two different tables, since they require different fields. If however we need to build a table containing phones, we can either build two distinct tables, as in:

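People's phones: Number, Owner
People: ID, Name, Surname, Birth date
Relation: People's phones.Owner (∞) to People.ID (1)

Companies' phones: Number, Owner
Companies: ID, Name, Type, Administrator
Relation: Companies' phones.Owner (∞) to Companies.ID (1)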

An alternative schema is the following, which uses two relations coming out from the same foreign key
field:


Companies: ID, Name, Type, Administrator
Phones: Number, Owner
People: ID, Name, Surname, Birth date
Relations:
Phones.Owner (∞) to Companies.ID (1)
Phones.Owner (∞) to People.ID (1)
However this schema creates a technical problem: many database management programs which automatically follow relations, such as Access, do not know whether to follow the first or the second relation in order to find the name of the phone's owner. Therefore, if the database designer does not have a good experience, it is better to avoid this second schema and to choose, according to the problem, the more appropriate between the first one and this third one:

Phones: Number, Owner
PeopleCompanies: ID, Company (yes/no), Name, Person surname, Person birth date, Company type, Company administrator
Relation: Phones.Owner (∞) to PeopleCompanies.ID (1)

filling the non-appropriate fields (such as Person surname and Person birth date when the record refers to a company) with an empty value, technically called Null.

1.7. Referential integrity


If two tables are related via a many-to-one relation, like the one between students and secretaries of section 1.2.1, we are no longer free to modify the data on the "1" side at will. For example, if we delete a secretary or if we change its ID, there are probably corresponding students in the Students table which become orphans, i.e. they do not have their corresponding secretary anymore and following their relation to the Secretaries table leads to a non-existent ID. This issue is known as referential integrity, which is the property of a database to have all the foreign keys' data correctly related to primary keys' data. When a record on the "1" side table is deleted, referential integrity can be broken and this results in an inconsistent database.

Students: Student number, Name, Surname, Secretary
Secretaries: ID, Name, Surname, Telephone, Office
Relation: Students.Secretary (∞) to Secretaries.ID (1)

Secretaries
Secretary code   Name    Surname   Telephone    Office   Time
1                Anne    Boyce     0471222222   C340     14-18    (deleted)
2                Jessy   Codd      0471223334   C343     9-11


Students
Student number   Name    Surname    Secretary
2345             Mary    Smith      1           orphan
2376             John    McFlurry   2
2382             Elena   Burger     2
2391             Sarah   Crusa      1           orphan
2393             Bob     Fochs      2

On the other hand, if a table has only "many" side relations, its records can be freely deleted and modified without breaking referential integrity.
Some database management programs, such as Access, correctly check referential integrity if instructed to do so and forbid dangerous operations. Others do not always check: MySQL up to version 5.6, for example, ignores foreign keys on tables stored with the MyISAM engine, and there we must take special care when deleting records.
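What it means for a program to check referential integrity "if instructed to do so" can be sketched with Python's sqlite3 module, used here only as an illustration and not as a description of how Access or MySQL behave. In SQLite the check must be switched on explicitly with the PRAGMA foreign_keys command; after that, deleting a record on the "1" side which still has related records is refused, while records on the "many" side can be deleted freely.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# In SQLite the referential integrity check is off unless requested.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE Secretaries (SecretaryCode INTEGER PRIMARY KEY,"
             " Name TEXT, Surname TEXT)")
conn.execute("CREATE TABLE Students (StudentNumber INTEGER PRIMARY KEY,"
             " Name TEXT, Surname TEXT,"
             " Secretary INTEGER REFERENCES Secretaries(SecretaryCode))")
conn.executemany("INSERT INTO Secretaries VALUES (?, ?, ?)",
                 [(1, "Anne", "Boyce"), (2, "Jessy", "Codd")])
conn.executemany("INSERT INTO Students VALUES (?, ?, ?, ?)",
                 [(2345, "Mary", "Smith", 1), (2376, "John", "McFlurry", 2),
                  (2382, "Elena", "Burger", 2), (2391, "Sarah", "Crusa", 1),
                  (2393, "Bob", "Fochs", 2)])
conn.commit()

# Deleting secretary 1 would turn students 2345 and 2391 into orphans.
try:
    conn.execute("DELETE FROM Secretaries WHERE SecretaryCode = 1")
    delete_blocked = False
except sqlite3.IntegrityError:
    delete_blocked = True

# A record on the "many" side can be deleted without any problem.
conn.execute("DELETE FROM Students WHERE StudentNumber = 2393")

remaining_secretaries = conn.execute("SELECT COUNT(*) FROM Secretaries").fetchone()[0]
remaining_students = conn.execute("SELECT COUNT(*) FROM Students").fetchone()[0]
print(delete_blocked, remaining_secretaries, remaining_students)
```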

1.8. Temporal versus static database


It is a common mistake, when deciding the schema of the database, to limit it to an instantaneous view of reality, instead of building, often with the same design effort, a database which can also handle historical information. A temporal database does not only offer the opportunity to handle past data, but also gives the chance to easily revert to the previous situation in case of input errors, which for a static database is often impossible since data have been overwritten.
For example, the PropertyActs table in section 1.5 could be a static table, reflecting the current status quo of the property, or a historical table with the beginning and ending date of property. Simply introducing two date fields converts our database from static to temporal, opening a wide range of new possibilities.
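A minimal sketch of the temporal version of the PropertyActs table, written with Python's sqlite3 module. The data, the function name owners_on and two conventions are assumptions made for the example: dates are stored as year-month-day text, so that they sort correctly, and an empty (Null) ending date means that the property is still current. The same table then answers questions about any moment in time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PropertyActs (ActNumber INTEGER PRIMARY KEY,"
             " House TEXT, Owner TEXT, BeginDate TEXT, EndDate TEXT)")
conn.executemany(
    "INSERT INTO PropertyActs VALUES (?, ?, ?, ?, ?)",
    [(1, "Via Roma 1", "AAA", "2001-01-01", "2010-06-30"),
     (2, "Via Roma 1", "BBB", "2010-06-30", None)],  # None is stored as Null
)

def owners_on(house, day):
    """Owners of a house on a given day (day as 'YYYY-MM-DD' text)."""
    rows = conn.execute(
        "SELECT Owner FROM PropertyActs "
        "WHERE House = ? AND BeginDate <= ? "
        "AND (EndDate IS NULL OR EndDate > ?) ORDER BY Owner",
        (house, day, day),
    )
    return [row[0] for row in rows]

print(owners_on("Via Roma 1", "2005-03-15"), owners_on("Via Roma 1", "2014-09-24"))
```

In a static table the first act would have been overwritten by the second one and the question about 2005 could not be answered.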

1.9. Non-relational structures


There are some structures which are rather difficult to model with a relational database and cause common errors for non-expert users, since they require a restructuring of the commonly used diagram to be correctly implemented by a relational database.

1.9.1. Hierarchical structure


A relational database has severe problems modeling hierarchical structures such as a company's employee organization or a family genealogical tree. Situations with an intrinsic hierarchy can still be modeled by a relational database, using the relation to model the "depends on", but the hierarchy will not be easily observable from the database.

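Hierarchy to be modeled: President at the top; Director 1, Director 2 and Director 3 below.
People: Person code, Name, Surname, Role
Roles: Role name, Depends on
Relation: People.Role (∞) to Roles.Role name (1)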
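The hierarchy stored in a "depends on" field is not visible at a glance, but it can be reconstructed by following the field step by step. The sketch below uses Python's sqlite3 module and a recursive query, a feature of SQLite which is not covered in this book and is named here only for illustration; the role Manager 1 and the function name chain_of_command are hypothetical additions to the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Roles (RoleName TEXT PRIMARY KEY, DependsOn TEXT)")
conn.executemany(
    "INSERT INTO Roles VALUES (?, ?)",
    [("President", None),            # the top of the hierarchy depends on nobody
     ("Director 1", "President"),
     ("Director 2", "President"),
     ("Director 3", "President"),
     ("Manager 1", "Director 1")],   # hypothetical extra level
)

def chain_of_command(role_name):
    """Climb the hierarchy from a role up to the top, one step at a time."""
    rows = conn.execute(
        "WITH RECURSIVE chain(RoleName, DependsOn, Depth) AS ("
        "  SELECT RoleName, DependsOn, 0 FROM Roles WHERE RoleName = ?"
        "  UNION ALL"
        "  SELECT Roles.RoleName, Roles.DependsOn, chain.Depth + 1"
        "  FROM Roles JOIN chain ON Roles.RoleName = chain.DependsOn"
        ") SELECT RoleName FROM chain ORDER BY Depth",
        (role_name,),
    )
    return [row[0] for row in rows]

print(chain_of_command("Manager 1"))
```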


1.9.2. Process
A relational database cannot model a process, such as a sequence of production steps or activity statuses. Again, processes can be somehow modeled by a relational database, mimicking the sequence with a field "status" and with a relation "is subsequent of".

Process to be modeled: First stage, then Second stage, then Third stage.
Products: Product code, Entry date, Description, Status
Stages: Stage name, Follows
Relation: Products.Status (∞) to Stages.Stage name (1)

1.10. Entity-relationship model (level 9)


The schema diagram used so far is very comprehensible, but the entity-relationship model, proposed by Peter Chen [1], is another, more widely used diagram. In this diagram we use:
- entity, the correspondent of the table used before, indicated with a rectangle with the entity's name;
- attribute, the correspondent of the field used before, indicated with an ellipse with the attribute's name;
- relationship, indicated with a diamond with the relation's name;
- cardinalities of a relation, indicated with:
  - a single line to represent the "one" side of the relation, i.e. each element of the entity can have an undetermined number of correspondents on the other side;
  - a double line to represent the "many" side of the relation, i.e. each element of the entity must have a correspondent on the other side;
  - a line with an arrow to represent that each element of the entity has zero or one correspondent on the other side. For example, the "is currently married with" diamond has both lines with arrows;
  - a thick line to represent that each element of the entity has exactly one correspondent on the other side, used to model one-to-one relations.
The major differences are:
- many-to-many relations are not indicated by building another entity but directly with a diamond, exactly like one-to-many relations;
- each relation must have a name, even one-to-many relations (even though in the database implementation they do not correspond to a table and thus do not need a name), very often using the active and passive forms of a verb, such as "owns / is owned by" or "orders / is ordered by";
- in case a relation possesses attributes, they are indicated as dependents of the diamond, regardless of whether the relation is many-to-many (and thus has a table in the database implementation which can have fields) or one-to-many (and thus will not have a table in the implementation);
- the case in which the relation has one or zero correspondents on the other side cannot be described by the other diagram.
In this way the database designer can concentrate much more on the modeling of the situation, leaving technical details such as junction tables or relations' properties to a later stage.

[1] P. P. Chen, "The Entity-Relationship Model: Toward a Unified View of Data", ACM Transactions on Database Systems, 1976, vol. 1, pp. 9-36.

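[Figure: entity-relationship diagram. The entity Student (attributes St number, Name, Surname) is connected through the relationship "is assigned to" to the entity Secretaries (attributes ID, Name, Surname).]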

1.10.1. Enhanced entity-relationship model
An enhanced entity-relationship model is an improvement which also permits the existence of a subclass, an entity which inherits all the attributes and relationships of another entity, called superclass, adding some extra attributes and relationships of its own. For example, a database "zoo" can have the entity "animals" with attributes "scientific name" and "common name", which can have as subclass the entity "birds" with the same two attributes and the extra one "average wingspan". This permits the modeling of some hierarchical structures (see section 1.9.1) and solves the problem of foreign keys with several relations (see section 1.6).


2. Microsoft Access (level 1)


Microsoft Access 2010 is a database management program, a program which is in charge of handling data and extracting them by correctly following the relations and doing other sorting and filtering operations. Moreover, Access takes care of preserving the correct structure of the database by enforcing referential integrity (see section 1.7).
Other famous database management programs are Oracle, MySQL and PostgreSQL.

2.1. Basic operations


Access behaves in a slightly different way with respect to Word or Excel, and Office users can easily become confused.
The most important thing to keep in mind is that Access automatically saves every data operation as soon as it is done, without needing to give the save command. Exceptionally, only the last data modification can be undone, but not the others. This behavior is typical of databases, where several people must access the data at the same time and therefore data must always be up to date. There is a File → Save [2] command in Access, but it is typically used to save the objects (tables, queries, reports, forms) inside the database file, while there is no need to use it to save the whole database file on the hard disk. If we want to save the database file with another name or in another format, the right tool to do it is the command File → Save Database As [3].
When the database is opened, the main window usually appears on the left [4]. Choose from the drop-down menu Object Types and then select All Access Objects. Each object category presents the list of all those objects together with two buttons to manually create another object or to create it with the help of a wizard tool.

2.1.1. Northwind example


Microsoft Access provides an official database example called Northwind. It is better to download from the course's website the 2003 version, which is much simpler. [5]
When this database (or any other one containing Visual Basic macros) is opened, Access usually displays a Security Warning [6], which can be dismissed by clicking on Enable. Then, this database example presents at its activation a splash window, i.e. a graphical interface which leads the user to the most common operations [7]. Building graphical interfaces is beyond the scope of this book and the splash window can be skipped, arriving directly at the database main window, which is composed of a left window displaying all the database's objects and a right window displaying the currently open object.

[2] For Access 2007: Office button → Save.
[3] In Access 2007: Office button → Save As → Save the database in another format, or Office button → Manage → Back Up Database, while in Microsoft Access 2003 there is only the option File → Backup Database.
[4] In Access 2003 it is a floating window with the object types menu on the left, and All Access Objects does not need to be selected.
[5] Access 2007 and 2010 also have a Northwind 2007 database, but it must be downloaded or generated from a template file. Moreover, it has a much more complicated structure which can mislead the novice user. Office 2007 users should copy the Northwind.mdb file from an Office 2003 system in C:\MicrosoftOffice\Office11\Samples or, if a computer with Office 2003 is not available, search for this file on the Internet with the help of a search engine.
[6] With Access 2007 click on Options → Enable this Content, while Access 2003 displays instead three pop-up windows to which No, then Yes, then Open should be answered; they can be permanently disabled by choosing Tools → Macro → Security → Low.
[7] Access 2003 displays two splash windows.

2.1.2. Relationships diagram (level 3)


Access provides a tool to automatically display the database's schema through the command Database Tools → Relationships → Relationships [8]. This opens a graphical interface displaying tables, fields and relations.
This tool has however some small problems:
- if some tables are missing, right click → Show all;
- if non-existent tables are displayed, often with an _1 extension, select them and press Del;
- when closing the relationships diagram, Access asks to save it. This only saves the layout of the diagram; the modified relations are instead saved immediately.
In the relationships diagram relations can be created, deleted and modified:
- to delete a relation, select it → right click → Delete. Warning: this operation cannot be undone;
- to modify a relation, double click on it. A graphical interface appears, which displays the two tables and the related fields together with the relation type, which is automatically decided by Access according to the structure of the tables and to the fields involved. Moreover, there is also a checkbox to enforce referential integrity: it is very important that this box is checked, because in this way Access will forbid the deletion of records in the "1" side table when this operation breaks the referential integrity of this relation;
- to create a relation between two fields, simply drag one field onto the other. If at least one of the two fields is a primary key, Access automatically recognizes the relation's type and the only remaining thing to do is to apply referential integrity.
It is always better to do any modification of the structure before filling the tables with data. Once data are inside, Access may refuse to do certain operations on relations when the present data are inconsistent with the new relations, or may delete data inside foreign key fields when they do not comply with the new relation.
Relations can also be built with the Lookup Wizard, as described below. Using this tool Access creates at the same time the relation and, in the foreign key field, a user-friendly drop-down menu which speeds up data insertion.

2.2. Tables (level 1)


Tables are accessed by choosing the Tables object in the main database window.
The best way to create a new table is Create → Tables → Table Design [9]. This icon opens an empty table in Design View, where fields can be added with the indication of the primary key, the field name, the field type and the field's detailed description. Right clicking on the left column gives the possibility to define or remove a primary key, while the field type can be chosen from a drop-down menu in the third column.
Each existing table in the main database window can be opened by double clicking on it. The table is displayed in Datasheet View, which is an Excel-like way to look at the table and which is also a convenient way to edit or insert data. In order to see the table in Design View, we choose Home → View → Design View [10].

2.2.1. Field types


It is very important to choose the correct field type because it helps the database to minimize the space allocation and to avoid wrong insertions. In Access the most important field types are:

[8] For Access 2007: Database Tools → Show/Hide → Relationships; for Access 2003: File → Relationships.
[9] For Access 2003 instead choose Create table in Design View from the main database window.
[10] For Access 2003: View → Design View.


Text, which contains up to 255 alphanumeric characters. This type is suitable for names, addresses
and every short text;
Memo, which contains up to 65,536 alphanumeric characters. This type is suitable for long texts, such
as curricula, abstracts and small articles;
Number, which contains numbers which can be manipulated through mathematical operations. This
type must be used only for numerical information. It is a very common mistake to use it for numeric
codes, such as telephone numbers, version numbers and ZIP codes. Numeric codes must use the Text
type, since they may start with 0 (telephone 0471012343 or ZIP 00100) or have a 0 as last decimal
digit (version 7.10) and, in any case, mathematical operations with them must be forbidden;
Date/Time, which contains dates and times. Exactly like Excel, Access stores date and time
together, using integer numbers for days. Therefore dates can be subtracted to obtain the difference
in days, or numbers can be added to or subtracted from them to move forward into the future or back into the
past;
Currency, which contains numbers automatically displayed with a currency symbol;
AutoNumber, a type which is used only by Access to create IDs;
Yes/No, a type with only two values;
OLE object, which contains other files, such as images or documents. These files may be embedded,
which means that the database file automatically contains a duplicate of this file (thus making the
database file larger), or linked, which means that the database file simply contains the link to the file
(thus making it unusable when the external file is not available);
Hyperlink, which contains a hyperlink, usually to a web page or to an email address;
Lookup Wizard is not a real field type but the possibility to take the field's values from a predetermined
list or from other tables.
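
The reasoning behind the Number and Date/Time types can be checked with a few lines of Python. This is only an illustrative sketch outside Access: the variable names are made up for the example, and Python's own types stand in for Access field types.

```python
from datetime import date

# A ZIP code stored as a number loses its leading zeros...
zip_as_number = int("00100")
# ...while the same code stored as text is preserved exactly.
zip_as_text = "00100"

# A version number stored as a decimal number loses its trailing zero.
version_as_number = float("7.10")

# Dates behave like day counts: subtracting two dates gives a number of days.
days_between = (date(2014, 9, 24) - date(2014, 1, 1)).days

print(zip_as_number, zip_as_text, version_as_number, days_between)
```

The numeric versions of the ZIP code and of the version number no longer match what was typed, which is why such codes belong in a Text field.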

Lookup Wizard (level 3)


From the Lookup Wizard, choosing "I will take the values from another table", Access automatically builds a
relation with another table, taking the current field as foreign key on the many side and the primary key of
the selected table as the 1 side of the relation. At the same time, during the building of this relation, Access
asks which fields the user wants to display in the drop-down menu. Since these fields are simply what will
be displayed, not the real field involved in the relation (which is instead the primary key), it is convenient to
choose the most meaningful fields for data editing (for example, name, surname and birth date of a
person). At the end of the wizard procedure the relation is created, but remember to enforce referential
integrity in the last screen [11].
From the Lookup Wizard, choosing instead "I will type in the values that I want", Access lets the user type in
a predetermined list of values which will be offered every time the field is filled. The list is not mandatory
and a different value can be manually typed in the field.

Mandatory predetermined list (level 3)


If the list of predetermined values should instead be mandatory, you have the option to choose during the
wizard's procedure that the list is mandatory [12]. However, building the list in this way does not offer a
flexible way of adding new values to the list.
It is therefore much better to build the list using another table which simply contains the list of values as
primary key and to build, via Lookup Wizard, a relation which takes values for this field from the primary key
of the new table, such as in this example.

[11] For Access 2003 and 2007: this option does not appear in the last screen and thus you should manually add
referential integrity from the relationships diagram (see section 3.3).
[12] For Access 2003 and 2007: this option is not available, mandatory lists must be built only using another table.

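[Figure: table Products (Code, Name, Price, Category) related to table Categories (Category name); Categories is
on the 1 side of the relation and the field Category of Products is the foreign key.]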

With this solution the user may not choose values which are not in the list; however, they can add other
values to the list simply by adding more records to the second table, without having to modify the field's
features in the first table.
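
The same idea can be sketched outside Access. The example below is an assumption-laden illustration, not Access itself: it uses Python's built-in sqlite3 module, the table and field names follow the example above, and the helper add_product is hypothetical. The foreign key plays the role of the relation with referential integrity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

# The list of allowed values lives in its own table, as primary key.
conn.execute("CREATE TABLE Categories (CategoryName TEXT PRIMARY KEY)")
conn.execute("""
    CREATE TABLE Products (
        Code     INTEGER PRIMARY KEY,
        Name     TEXT,
        Price    REAL,
        Category TEXT REFERENCES Categories(CategoryName)
    )
""")
conn.executemany("INSERT INTO Categories VALUES (?)", [("Food",), ("Drinks",)])


def add_product(name, price, category):
    """Return True if the product is accepted, False if its category is not in the list."""
    try:
        conn.execute(
            "INSERT INTO Products (Name, Price, Category) VALUES (?, ?, ?)",
            (name, price, category),
        )
        return True
    except sqlite3.IntegrityError:
        return False


accepted = add_product("Bread", 2.5, "Food")
rejected = add_product("Hammer", 9.0, "Tools")       # "Tools" is not in the list yet
conn.execute("INSERT INTO Categories VALUES ('Tools')")  # extend the list with a new record
accepted_later = add_product("Hammer", 9.0, "Tools")
```

Extending the list only requires a new record in Categories; the structure of Products is never touched.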

2.2.2. Field properties (level 3)


According to the chosen field type, several field properties appear in the bottom window. The most
interesting are:
Field Size, which restricts the number of characters which can be inserted for both text and numeric
fields;
Format, which defines how the data are displayed;
Decimal Places, which fixes the decimal digits for numbers;
Default Value, a value which is automatically assigned to the field whenever the user does not type
anything;
Validation Rule, a rule to which values of this field must adhere to be accepted. This rule can be quite
complex, but may not use values taken from other fields. For example, to indicate that field Age must
be 18 or above, the validation rule is >=18;
Validation Text, the warning displayed to the user when the validation rule is broken;
Required, which indicates that the field must be filled. When Required is set to Yes, records with no
value in this field are not accepted;
Allow Zero Length, which indicates whether an empty sequence of characters (which is not
considered an empty value) can be inserted in a text field or not;
Indexed, which indicates to Access that this field is going to be used for searches. Access organizes
this field in a special way to speed up future searches. Every foreign key is obviously indexed, while
for a primary key it is not necessary to indicate it, as it is implicit in the primary key indication.
Concerning other fields, for some it is obvious that they must be indexed, for example field
surname in every table of people, for others the decision is up to the database designer, while for
some others it is evident that they are not indexed, such as fields Notes or Picture. Indexed can be
Duplicates Allowed or No Duplicates, depending on whether we want to allow two records to have
the same value for this field. This is also a good trick to force a field, which is not a primary key, to
have all different values.
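
Several of these properties have close counterparts in plain SQL. The sketch below is only an analogy under stated assumptions: it uses Python's built-in sqlite3 module rather than Access, and the People table, its fields and the helper try_insert are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE People (
        ID      INTEGER PRIMARY KEY,
        Surname TEXT NOT NULL,              -- Required = Yes
        Email   TEXT UNIQUE,                -- Indexed (No Duplicates)
        Country TEXT DEFAULT 'Italy',       -- Default Value
        Age     INTEGER CHECK (Age >= 18)   -- Validation Rule >=18
    )
""")


def try_insert(surname, email, age):
    """Insert a record and report whether the constraints accepted it."""
    try:
        conn.execute(
            "INSERT INTO People (Surname, Email, Age) VALUES (?, ?, ?)",
            (surname, email, age),
        )
        return "accepted"
    except sqlite3.IntegrityError as error:
        return f"rejected: {error}"


results = [
    try_insert("Rossi", "rossi@example.com", 30),
    try_insert("Bianchi", "bianchi@example.com", 16),  # breaks the validation rule
    try_insert(None, "nobody@example.com", 40),        # required field left empty
    try_insert("Verdi", "rossi@example.com", 25),      # duplicate in a No Duplicates field
]
default_country = conn.execute(
    "SELECT Country FROM People WHERE Surname = 'Rossi'"
).fetchone()[0]
```

Only the first record is accepted, and its Country field receives the default value because nothing was typed for it.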

Table validation rule (level 3)


While a field validation rule cannot involve other fields, sometimes it is necessary to put a validation rule on
entered data that cross-checks the values of different fields of the same table. To do this, in Design View
choose Show/Hide → Properties [13] and add a table validation rule (using also the Expression Builder, see
section 2.4.2) and its corresponding validation text. For example, for a hotel booking table it is necessary
to have departure dates not before arrival dates and therefore the condition here is
[Departure Date] >= [Arrival Date].

[13] For Access 2003: View → Properties.


If more rules are needed, they must be combined with the And operator, for example
( [Departure Date] >= [Arrival Date] ) And ( [Booking Date] <= [Arrival Date] ).
Unfortunately there is only one validation text and it is not possible to tell the user exactly which part of
the rule their data do not adhere to.
More complex expressions can be built with the Expression Builder (see section 2.4.2).
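
A table validation rule corresponds to a constraint that involves several fields of the same record. The following sketch assumes Python's built-in sqlite3 module as a stand-in for Access; the Bookings table, its field names (written without spaces) and the helper is_accepted are hypothetical. Dates are stored as ISO text, so that text comparison agrees with chronological order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Bookings (
        ID            INTEGER PRIMARY KEY,
        BookingDate   TEXT,
        ArrivalDate   TEXT,
        DepartureDate TEXT,
        -- one rule for the whole table, combining two conditions with AND
        CHECK (DepartureDate >= ArrivalDate AND BookingDate <= ArrivalDate)
    )
""")


def is_accepted(booking, arrival, departure):
    """Return True if the record satisfies the table-level rule."""
    try:
        conn.execute(
            "INSERT INTO Bookings (BookingDate, ArrivalDate, DepartureDate) VALUES (?, ?, ?)",
            (booking, arrival, departure),
        )
        return True
    except sqlite3.IntegrityError:
        return False


valid_stay = is_accepted("2014-05-01", "2014-06-10", "2014-06-15")
leaves_before_arriving = is_accepted("2014-05-01", "2014-06-10", "2014-06-05")
booked_after_arrival = is_accepted("2014-06-12", "2014-06-10", "2014-06-15")
```

As in Access, the error raised for a rejected record does not say which of the two conditions was broken.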

2.2.3. Importing tables (level 3)


Access can import data from several sources, typically tab-delimited text files, comma-separated
text files and Excel files. The importing operation for text files is very similar to importing a text file into
Excel. The importing operation of an Excel file is very easy with the command External Data → Import &
Link → Excel [14].
The only thing to pay attention to is that data must already be well structured before importing them into a
table, otherwise the importing procedure will stop several times.

2.3. Forms (level 3)


A form is a graphical interface which lets the user look at, modify, insert and delete data. When the user is not the
database administrator, it is better that tables are not accessed directly, since a wrong value, especially in a
foreign key, can lead to an inconsistent database. Forms can present the data in a more user-friendly way
and can restrict access to some data or forbid some data operations.
To produce a form in Access we build it with the wizard, choosing Create → Forms → Form Wizard [15]. This
guides us through a step-by-step procedure, where we:
choose the tables and the fields from which data are taken. If data come from more than one table,
Access automatically takes into consideration the existing relations and builds an appropriate
subform for data on the many side. Data can also be taken from queries (see section 2.4), but
usually they are not;
choose the layout of the form;
choose the style of the form;
assign a name to the form. The form, being an object, needs to be saved inside the database file.
The form can then be opened in Form View to access the data, paying attention to the fact that, as usual,
every data modification is automatically reflected in the corresponding table.
The form can also be opened in Design View to change its layout and style, or to add and remove fields.
Choosing the command Tools → Properties [16] opens the properties of all the form's parts, from which the layout can
be precisely defined. Among these properties the most important ones are in the Form drop-down menu,
Data tab, where modifying the fields Allow Modifications, Allow Deletions and Allow Insertions
restricts the user from modifying, deleting and inserting records through this form.
If the Form Wizard does not work, as can happen in case of a wrong installation of Access 2010, the form
can be created without using the wizard by clicking first on the main table that is to be used and then on
Create → Forms → Form. This will create a form with all the fields of that table and the unnecessary ones
can be later removed in Design View. In case a subform is needed, the table containing the data to be
inserted in the subform can be simply dragged inside the existing form.

[14] For Access 2007: External Data → Import → Excel; for Access 2003: File → External data → Import.
[15] For Access 2007: Create → Forms → Other Forms → Create using Wizard; for Access 2003 we instead choose
Create Form using wizard from the database main window.
[16] For Access 2003: View → Properties.

2.4. Queries (level 1)


A query is a question posed to the database, which answers with a virtual table called view. The question
can be, for example, "Which students were born in 2007?" or "Who are the German students, what is their
address, and what are their grade averages?". The view would be in these cases a table with a single
column with the students' numbers, or a table with four columns with student's name, student's surname,
student's address and the average of the grades of that student.
The view therefore contains all the values corresponding to the fields selected in the query, organized
following correctly the underlying relations, sorted and filtered according to the query's indication and
together with virtual fields created using formulas contained in the query. It is thus a powerful tool to
extract information from the database.
The view, even though it is virtual, is directly linked to the real data and any modification to its data is
automatically reflected in the original tables. The view does not contain any formatting: to present a query's
results in a better-looking format, a report is the appropriate tool (see section 2.5).

2.4.1. Selection queries


The select query is the standard type of query, a direct question to the database which involves only data
extraction following correctly the relations and sometimes doing calculations. For example, a question such
as "Which exams has Jack passed?" or "How many students does each secretary have in charge?".
To produce a select query in Access we simply use the command Create → Queries → Query Wizard [17]. This
guides us through a step-by-step procedure, where we:
choose the tables and the fields from which data are taken. If data come from more than one table,
Access automatically takes into consideration the existing relations. Data can also be taken from
other queries;
give the query a name. The query, being an object, needs to be saved inside the database file.
The query can then be opened in Datasheet View to access the data, paying attention to the fact that, as
usual, every data modification is automatically reflected in the corresponding tables.
The query can also be opened in Design View, where we have full control over what is displayed in the
view. In the upper side of this window we see the involved tables with their relations and we can add other
tables with right click → Add tables. In the lower side we have the selected fields, which can be removed
with right click → Del or to which other fields can be added simply by dragging them from the tables above.
On a field we can also put a sorting option, ascending or descending, or a Show option to display/hide the
field (obviously hiding makes sense only when the field is used for something else, otherwise it would be
better to remove it directly from the query).
If the Query Wizard does not work, as can happen in case of a wrong installation of Access 2010, the
query can be created directly in Design View by clicking on Create → Queries → Query Design.
We can also use a criterion to filter out some records. For example, we can type in the criteria space
directly a value (enclosed in quotation marks if that field is a textual field) and only the data having this value in
this field are displayed. We can also put more complicated conditions, such as equalities and inequalities or
conditions involving other fields. For example, to filter out underage students, we can put in the Age field
criterion >=18, while to consider only students enrolled from 2013, we can put in the EnrollDate field
criterion >= #1/1/2013#.

[17] For Access 2007: Create → Other → Query Wizard; for Access 2003, we select the Queries objects in the main
database window and choose Build Query using Wizard.


However, when the condition becomes too complicated, it is better to use the Expression Builder (see
section 2.4.2). If two criteria are put on the same field, typing them in two spaces one above the other,
they are automatically alternative (it is enough that one of them is valid for those data to be displayed); on
the other hand, if two criteria are put on different fields, typing them in two spaces of the same row, they
are considered together (both must be valid for those data to be displayed). Again, complicated conditions
with logical operators are more easily built with the Expression Builder.
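
The two ways of combining criteria correspond to the logical operators AND and OR. The sketch below illustrates this with Python's built-in sqlite3 module instead of the Access design grid; the Students table and its contents are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (Name TEXT, Age INTEGER, Country TEXT)")
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?)",
    [("Anna", 17, "Germany"), ("Bruno", 22, "Italy"),
     ("Carla", 19, "Germany"), ("Dario", 16, "Austria")],
)

# Criteria on different fields in the same row: both must be valid (AND).
adults_from_germany = [row[0] for row in conn.execute(
    "SELECT Name FROM Students WHERE Age >= 18 AND Country = 'Germany' ORDER BY Name"
)]

# Criteria on the same field, one above the other: one valid criterion is enough (OR).
german_or_austrian = [row[0] for row in conn.execute(
    "SELECT Name FROM Students WHERE Country = 'Germany' OR Country = 'Austria' ORDER BY Name"
)]
```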

Virtual fields
Other fields can be automatically generated taking values from any field, even fields not present in the
query (but their tables must be present in the upper part of the query's window), and applying mathematical,
logical or textual operations. We simply type in the field's name space of an existing query the virtual field's
name followed by a colon and by its expression, using the Expression Builder (see section 2.4.2) or typing
the names of the fields involved in the operation enclosed in square brackets. For example, to build virtual field
ToPay which contains the price of an order with a single product, we simply type in the field's name space
ToPay: [Price] * [Quantity].
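
In SQL terms a virtual field is a computed column with an alias. A minimal sketch, assuming Python's built-in sqlite3 module in place of Access and a made-up Orders table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (Product TEXT, Price REAL, Quantity INTEGER)")
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?)",
    [("Pen", 1.5, 4), ("Book", 20.0, 2)],
)

# ToPay does not exist in the table: it is computed for every record of the view.
to_pay = conn.execute(
    "SELECT Product, Price * Quantity AS ToPay FROM Orders ORDER BY Product"
).fetchall()
```

The Orders table itself still has only three fields; ToPay exists only in the result of the query.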

If we instead type in the name's space or in the criteria's space something between square brackets
which does not correspond to any field in the present tables, Access stops the execution of the query and
asks us for the value of that thing. This is a good trick to force Access to ask the user for a value. For example,
putting in the criteria space of the EnrolmentYear field [Please, tell me the enrolment year] and switching to
Datasheet View forces Access to stop the execution of the query, realize that a field with the name "Please, tell
me the enrolment year" does not exist, and display a dialog box which says exactly "Please, tell me the
enrolment year".
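
Outside Access, the usual counterpart of this trick is a query with a placeholder whose value is supplied when the query runs. The sketch below assumes Python's built-in sqlite3 module; the Students table and the function students_enrolled_in are hypothetical, and the value is passed as an argument instead of being typed into a dialog box.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (Name TEXT, EnrolmentYear INTEGER)")
conn.executemany(
    "INSERT INTO Students VALUES (?, ?)",
    [("Anna", 2012), ("Bruno", 2013), ("Carla", 2013)],
)


def students_enrolled_in(year):
    """Run the same query with a value that is known only at execution time."""
    rows = conn.execute(
        "SELECT Name FROM Students WHERE EnrolmentYear = ? ORDER BY Name",
        (year,),  # the ? placeholder is filled with this value
    )
    return [row[0] for row in rows]


enrolled_2013 = students_enrolled_in(2013)
enrolled_2011 = students_enrolled_in(2011)
```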

2.4.2. Expression Builder (level 1)


The Expression Builder is a powerful tool to build expressions, which can be called by clicking on the three dots
any time there is the need for an expression, e.g. validation rules, table validation rules, query's criteria and
query's new fields. If the three dots are not present, click on the magic wand button.
The Expression Builder has a main space where the expression appears. It can be directly typed in or it can
be built using the mathematical and logical operators presented below. If it is possible (i.e. we are not
inside a field's validation rule), other fields' values can be inserted in the expression by choosing them from
the menus below: they appear as their field's name enclosed in square brackets, sometimes preceded
by the table's name if the same field's name is used in more than one table. Many functions can also be
inserted, the most interesting being:
Date to get the current date;
DateDiff to get the difference in months or years between two dates;
DateAdd to get a date plus or minus a certain amount of months or years;
Year to get the year from a date, Month to get the month and Day to get the day;
the usual mathematical functions Abs, Exp, Sqr, Log, Int;
Like, which lets the user check whether the value corresponds to an expression using wildcards ?
(any character) and * (any amount of characters). For example, to check that a ZIP code has exactly 5
characters and starts with 39, we can use the condition Like "39???". To check that an email address
is reasonable, we can use the condition Like "*@*.*".
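
The behaviour of the two wildcards can be tried out in Python, whose standard fnmatch module happens to use the same ? and * symbols. This is only an approximation of the Access operator: the function like below is a hypothetical helper, and it is case-sensitive, while Access comparisons usually are not.

```python
from fnmatch import fnmatchcase


def like(value, pattern):
    """Check a text value against a pattern with ? (one character) and * (any amount)."""
    return fnmatchcase(value, pattern)


zip_ok = like("39100", "39???")            # 5 characters, starts with 39
zip_too_short = like("3910", "39???")      # only 4 characters
zip_wrong_start = like("38100", "39???")   # does not start with 39
email_ok = like("paolo@unibz.it", "*@*.*")
email_without_at = like("paolo.unibz.it", "*@*.*")
```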

2.4.3. Summary query (level 3)


A simple query is able to perform operations, but only acting within the same record of the view, and is not
able to perform operations involving different records, such as sums or averages. Examples of questions which
need such operations are "How many students did each secretary have in charge in 2009?" or "What is the
average student's grade this year for each country?".


To do this, we need a summary query: we create a simple query with the involved fields, then we press the
Show/Hide → Totals button [18] and a new option appears in the query's fields. This option is originally set to
Group By, meaning that the view will now be squeezed trying to group identical values of those fields. If
some records of the view have the same values in all the fields which have the Group By option, only a single
record appears in the view. This feature used alone can be applied to some rare situations (and, on the
other hand, causes some problems when the summary query is wrongly chosen instead of a simple query),
but usually it works in conjunction with the selection of another option for one field, such as Sum, Count or
Avg. In this case, the query tries to group using the fields with the Group By option and, for every group, it
calculates the sum, average or count of the values of the field with the other option.
For example, if we have a database with a Students table with a Country field and an Exams table with a
Grade field, and we want to calculate the average grade for each country, we need to create a query with
these two fields, convert it to a summary query, and then select the Avg option for the Grade field while leaving
the Group By option on the Country field. The result is a view with field Country with the list of countries, each
one appearing only once since the results are grouped by country, and field Grade (sometimes called
instead AvgOfGrade) with the average of the values of the Grade field for students in that country.
It is very important to remember that every field present in the query with the Group By option is used to
group, even when this field has the no show option or when this field is simply used for a criterion. A wrong
grouping, with extra fields, clearly produces more records in the view than there should be. Criteria in fields
with the Group By option can thus lead to problems and therefore it is always suggested to switch the Group By
option to Where whenever a field is used only for filtering and not for grouping.
For example, if in the database above we want to calculate the average grade per country in 2008, we need
to build a simple query with the fields Country, Grade and Date. Then we convert it to a summary query, we
select the Avg option in the Grade field and we select Where in the Date field, putting as criterion Between
#1/1/2008# and #31/12/2008#, and we unselect the Show option. If we instead leave the Group By option in the
Date field with the no show option and with this criterion, the result is grouped by country and by date, which
means that the view shows the average per country per exam session.
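
The difference between the Where and the Group By option on the date field can be reproduced in SQL. The sketch below assumes Python's built-in sqlite3 module rather than Access; the field names ID, StudentID and ExamDate and all the data are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (ID INTEGER PRIMARY KEY, Country TEXT)")
conn.execute("CREATE TABLE Exams (StudentID INTEGER, Grade INTEGER, ExamDate TEXT)")
conn.executemany("INSERT INTO Students VALUES (?, ?)",
                 [(1, "Italy"), (2, "Italy"), (3, "Germany")])
conn.executemany("INSERT INTO Exams VALUES (?, ?, ?)", [
    (1, 28, "2008-02-10"), (2, 24, "2008-06-20"),
    (3, 30, "2008-02-10"), (3, 26, "2008-06-20"),
    (1, 18, "2009-01-15"),  # outside 2008, must be ignored
])

# Date used only for filtering (Where option): one record per country.
average_per_country = conn.execute("""
    SELECT s.Country, AVG(e.Grade)
    FROM Students AS s JOIN Exams AS e ON e.StudentID = s.ID
    WHERE e.ExamDate BETWEEN '2008-01-01' AND '2008-12-31'
    GROUP BY s.Country
    ORDER BY s.Country
""").fetchall()

# Date left with Group By: one record per country per exam session.
average_per_session = conn.execute("""
    SELECT s.Country, AVG(e.Grade)
    FROM Students AS s JOIN Exams AS e ON e.StudentID = s.ID
    WHERE e.ExamDate BETWEEN '2008-01-01' AND '2008-12-31'
    GROUP BY s.Country, e.ExamDate
""").fetchall()
```

With the extra grouping field the view grows from two records to four, which is exactly the wrong grouping described above.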

2.4.4. Non selection queries (level 9)


While selection queries extract data from the database, there are other queries which actively modify data
in the database. The best way to handle these queries is to build them as selection queries in Design View
and Datasheet View, then from the Query Type tab [19] switch them to the appropriate query type, check
carefully in Design View that the query is doing exactly what is wanted and then finally press Results →
Run [20] to apply the changes to the data.
Make Table query converts the view into a real independent table. From now on those data are
independent from the data contained in the original tables.
Update query selects records and can assign a value to the present fields through a new space that
appears.
Delete query selects records and deletes all the records involved (and not only the content of the present
fields) from their corresponding tables.
Append query appends the records of the view to an existing table. Obviously the view's fields must match
exactly the fields of the table.
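
Each of the four query types has a direct SQL counterpart. The following sketch is an illustration under assumptions: it uses Python's built-in sqlite3 module instead of Access, and the Students and Archive tables with their data are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (Name TEXT, Country TEXT, Grade INTEGER)")
conn.executemany("INSERT INTO Students VALUES (?, ?, ?)", [
    ("Anna", "Germany", 27), ("Bruno", "Italy", 17), ("Carla", "Italy", 29),
    ("Dario", "Austria", 15), ("Eva", "Austria", 25),
])

# Make Table: the result of a selection becomes a new independent table.
conn.execute("CREATE TABLE Archive AS "
             "SELECT Name, Grade FROM Students WHERE Country = 'Germany'")

# Update: assign a new value to a field of the selected records.
conn.execute("UPDATE Students SET Grade = Grade + 1 WHERE Country = 'Italy'")

# Delete: remove the selected records entirely, not only one field.
conn.execute("DELETE FROM Students WHERE Grade < 18")

# Append: add the records of a selection to an existing table with matching fields.
conn.execute("INSERT INTO Archive (Name, Grade) "
             "SELECT Name, Grade FROM Students WHERE Country = 'Austria'")

remaining = conn.execute("SELECT Name, Grade FROM Students ORDER BY Name").fetchall()
archive = conn.execute("SELECT Name, Grade FROM Archive ORDER BY Name").fetchall()
```

Running the corresponding SELECT first, as suggested above, shows which records each command is going to touch before anything is changed.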

[18] In Access 2003 it is simply the Totals button.
[19] In Access 2003: from the Query menu.
[20] For Access 2003: Query → Run.


2.5. Reports (level 3)


A report is a way to present a database's data in a good-looking format.
To produce a report in Access we simply use the command Create → Reports → Report Wizard [21]. This
guides us through a step-by-step procedure, where we:
choose the tables or the queries and their fields from which data are taken;
choose whether data must be grouped or not. For example, students can be grouped by country if
the Country field is selected. If data from several tables are involved, Access always suggests a
grouping based on the relation's 1 side. Once the first grouping is selected, Access asks whether
further grouping is required. For example, students can be grouped by corresponding secretary and
then, inside each group, grouped again by country;
choose whether data must be sorted;
choose the layout of the report;
choose the style of the report;
give the report a name. The report, being an object, needs to be saved inside the database file.
The report is automatically opened in Print Preview [22]. Unfortunately closing Print Preview also closes the
report [23]. Thus, to be opened in another view, the report should be right-clicked to choose Report View to
look at the final result or Design View to change its layout and style, or to add and remove fields, titles, header
and footer. Grouping and sorting may be modified by clicking on Design → Grouping & Totals → Groups &
Sort.
Reports can be exported from Access in RTF format from Data → More → Word [24].
If the Report Wizard does not work, as can happen in case of a wrong installation of Access 2010, the
report can be created without using the wizard by clicking first on the main table that is to be used and then on
Create → Reports → Report. This will create a report with all the fields of that table and the unnecessary
ones can be later removed in Design View, together with modifications to groupings and sortings.

[21] In Access 2003, we select the Reports objects in the main database window and choose Build Report using
Wizard.
[22] In Access 2003 it is opened automatically in Report View.
[23] This happens only in Access 2010. For Access 2003 and 2007, closing the Print Preview jumps to another view.
[24] For Access 2003 and 2007: Data → Word.

3. MySQL (level 5)
MySQL is a free SQL database management program, a program which is in charge of storing data and
letting users extract, modify and insert them using the SQL language (see section 4).
Unlike Microsoft Access, the MySQL program runs as a server on a computer and receives SQL connections on
port 3306. These connections may come from the same computer or from external ones called clients, in
case port 3306 is open for external connections. To each client request, MySQL server returns an answer,
always in textual form, which may be a table in case of a data request, a simple acknowledgement in case of a
command request, or an error message.
Unlike Microsoft Access 2010, MySQL uses user authentication, i.e. the user must provide a valid username
and password to open a connection from their client to MySQL server and must have the privileges for the
operation they are trying to do. For example, a user without the ALTER privilege will be unable to
change the structure of a table.

3.1. HeidiSQL
Being a server, MySQL requires a client program to connect on the user's side. The simplest client program is a
command line interface, as the one in the picture, where the user types commands on a cursor line. However,
this kind of client is not user-friendly, especially when extracted data have the form of a complex table.
HeidiSQL is one of the many graphical clients for MySQL server. It handles the connection procedure and all
the commands the user may send to the database, displaying the result in a more user-friendly graphical
format. It is freely available on www.heidisql.com, but a preconfigured portable version, where session
information is already inserted, can be downloaded from the course's website.
The first thing that HeidiSQL displays is the Session Manager, where, when not using a preconfigured portable
version, we should build our most commonly used sessions by filling in these details:
Network Type: we choose MySQL (TCP/IP);
Hostname/IP: we type here localhost if MySQL server is running on our current computer,
otherwise the IP number or, better, the Internet name of the computer on which MySQL server is
running (on the course's website there are instructions on how to connect to unibz MySQL server);
we flag Prompt for Credentials, unless we want the client to remember our username and our
password;
Port: 3306;
we do not flag Compressed client/server protocol.
We save the session we have just built and we assign it a meaningful name. Then we click Open to open the
connection.
Remember that, in order to connect to MySQL server at unibz from outside unibz's LAN, we must first
activate Virtual Private Network on our computer, otherwise the connection will be rejected by unibz
firewall as untrustworthy. Instructions on how to set up VPN are on
http://www.unibz.it/en/ict/ComputerInternet/network/vpn.
Once inside HeidiSQL's interface we find on the left the databases' structures, while on the top right we may
navigate through the objects of the currently chosen database and have a look at their structure and their
content (in the picture: current database is called sakila, current table is actor). In the lower window
we see the most recent commands that we submitted to the database and the database's acknowledgements or
errors, while the results of our commands are displayed in the central right window.
Clicking on the upper Query tab we are able to type queries directly in SQL. They are run by pressing F9 or the
Run button and we see as usual the acknowledgement or the error in the lower window and the result in
the central right window (in the picture: current query is in blue and pink select * from actor;, first result
is in blue, green and red 1 PENELOPE GUINESS 2006-02-15 04:34:33, database acknowledgement is in grey /*
0 rows affected, 200 rows found. Duration for 1 query: 0,000 sec. */).


3.1.1. Connection not working
If we are connecting to unibz server from a unibz internal computer:
we mistyped username or password;
our user does not have an account on that database;
MySQL server is down.
If we are connecting to unibz server from an external computer:
we are not using Virtual Private Network (see above, section 3.1);
we mistyped username or password;
we mistyped the Internet name of unibz's server;
our user does not have an account on that database;
MySQL server is down;
we are not connected to the Internet;
our firewall prevents outgoing connections to port 3306.
If we are connecting to another server on our own computer: in this case we should try to use the
command line interface to check whether the server is up and our password works, as described at the end of
the procedure in section 3.2.
If we are connecting to another external server:
we mistyped username or password;
we mistyped the server's Internet name;
our user does not have an account on that database;
MySQL server is down;
our user does not have the privilege to connect to MySQL server from outside (MySQL server checks
from which computer we are connecting every time we connect);
we are not connected to the Internet;
our firewall prevents outgoing connections to port 3306;
the firewall of the computer on which MySQL is running prevents incoming connections on port 3306.
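
Several of these causes (server down, firewall, no network) can be told apart from a wrong username or password by checking whether port 3306 can be reached at all. The sketch below uses only Python's standard socket module; the function name can_reach and the 3 seconds timeout are arbitrary choices for the illustration, and the check says nothing about MySQL accounts or privileges.

```python
import socket


def can_reach(host, port=3306, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened, False otherwise."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and names that cannot be resolved.
        return False
```

For a local server we could call can_reach("localhost"). If the answer is False, the problem lies with the server, the network or a firewall; if it is True, username, password and privileges are the next things to check.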

3.2. Installing MySQL server


While it is not necessary to install a MySQL server, as there are others available to do exercises, it may be a
convenient choice to install a local MySQL server, which will allow our client to make faster connections and
work also without Internet connection.
The best way to do it is downloading MySQL Community Server from www.mysql.com through the Windows
Installer, which simplifies the downloading and installation process.
We run the installer and:
1) we choose Install MySQL Products;
2) we accept the license;
3) if we have downloaded the latest version, we may skip the download of updates;
4) we choose Custom setup type;
5) the suggested features are MySQL Server, MySQL Notifier and MySQL Documentation;
6) according to what we have selected in the previous step, the installer might ask us the permission to
automatically install other products;
7) we review the list of programs to be installed and we press Execute;
8) we configure the server (Development Machine, port 3306, Open Firewall port for network access, Show
Advanced Options);
9) we choose a very difficult root password and write it down in a safe place. Then we click on Add User;
10) we create a new user which will be us (All Hosts (%), Role DB Manager) and we provide a password;
11) we assign a name to our server and we let it start at Startup (unless we plan to run it ourselves
whenever we need it) with Standard System Account;
12) we assign a name to the error log file and we choose whether we want other activities to be logged;
13) in our All Programs list we should see a MySQL folder with the Command Line Client;
14) we run the Command Line Client and we type the root password.
We should now be inside the database as root, with all privileges. We check which databases we have by
typing the command SHOW DATABASES;.
The Sakila database should already be present inside the server. To load other databases, the fastest way is
loading the SQL code on the course's website which, once executed, builds them automatically and fills the
tables.


4. SQL language for MySQL (level 5)


SQL is a structured query language for extracting and modifying data and managing databases. Its
commands are typed directly into a MySQL client and executed. In particular, using HeidiSQL we type them in
the Query window and we run them by pressing F9 or the Run button. The Query window can be
saved as an SQL file by pressing CTRL+S or the Save SQL to File button. A new Query window can be created
by pressing on the appropriate tab. The entered commands are repeated in the lowest window, together
with MySQL's acknowledgement or error message. This window can also be saved by right-clicking
on it and choosing Save as text file.
SQL commands can be entered with keywords written in uppercase or lowercase, it does not matter.
Traditionally capital letters are used for keywords to make the code more readable. Commands must always end
with a semicolon.
In HeidiSQL, pressing F1 while a keyword is highlighted calls the SQL help, where we get a detailed
description of the syntax.

4.1. Basic operations


The first SQL command to operate with a database is USE {database}; for example, to use the database
Sakila, USE sakila;. In HeidiSQL we simply click on a database in the left window and a USE {database};
command is immediately executed.
Once the database is chosen, activities on this database's tables can be done. However, in order to perform
them, the user must have proper authorization. A different authorization can be assigned to the user on
each different table of each database, but usually the administrator assigns the same authorization on all
the tables of the same database. Authorizations, which the administrator assigns with the GRANT command,
include SELECT to perform select queries, INSERT to insert data, UPDATE to modify data, DELETE to delete
data and ALTER to modify the table's structure.
Special care must be taken before performing operations which modify the data or the structure. While
some versions of MySQL database offer the possibility to undo the last performed operation using the
ROLLBACK; command, many others do not. Thus, while simply looking at data or performing selection
queries does not cause any harm, we must think twice before modifying data or changing the structure of a
table.
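
What undoing an operation means can be tried safely on a throw-away database. The sketch below is an assumption: it uses Python's built-in sqlite3 module, where a modification stays pending until it is committed, and a made-up Students table. It does not show how a particular MySQL installation behaves.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (Name TEXT)")
conn.executemany("INSERT INTO Students VALUES (?)", [("Anna",), ("Bruno",)])
conn.commit()  # from here on the two records are permanently stored


def count_students():
    return conn.execute("SELECT COUNT(*) FROM Students").fetchone()[0]


conn.execute("DELETE FROM Students")  # a dangerous command, not yet committed
count_after_delete = count_students()

conn.rollback()                       # undo everything since the last commit
count_after_rollback = count_students()
```

Once a modification has been committed, rollback can no longer bring the data back, which is why modifying commands deserve a second look before being run.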

4.2. Simple selection queries


A query is a question posed to the database, which answers with a temporary table of data. In HeidiSQL this
temporary table is displayed immediately below the window where the command is entered.
The question can be, for example, "Which students were born in 2007?" or "Who are the German students,
what is their address, and what are their grade averages?". The result would be in these cases a
temporary table with a single column with the students' numbers, or a temporary table with four columns
with student's name, student's surname, student's address and the average of the grades of that student.
The result therefore contains all the values corresponding to the fields selected in the query, organized
following the underlying relations according to the query's indication, sorted and filtered according to the
query's indication and together with virtual fields created using formulas contained in the query. It is thus a
powerful tool to extract information from the database.
The selection query is the basic type of query, a direct question to the database which involves only data
extraction following correctly the relations and sometimes doing calculations. For example, a question such
as "Which exams has Jack passed?" or "How many students does each secretary have in charge?".


The basic SQL syntax for a selection query is
SELECT {fields} FROM {table};
where {fields} is a list of fields separated by commas and {table} is the name of a table. For example,
SELECT FirstName, LastName, BirthDate FROM Students;
produces a temporary table with 3 columns extracted from the permanent table Students, which probably
contains many more fields.
To add a filtering or sorting condition we just need to add, after the table's name and before the semicolon,
WHERE {condition} ORDER BY {field} {ASC|DESC}
For example,
SELECT FirstName, LastName FROM Students WHERE ( Enrolment_Date >= '2013-01-01' ) ORDER BY
LastName ASC, FirstName ASC;
produces a list of students enrolled from 2013, ordered first by surname and, in case two students have the
same surname, by first name.
The query
SELECT FirstName, LastName, Residence_Address, Residence_Country FROM Students WHERE ( (
Enrolment_Date >= '2013-01-01' ) AND ( Residence_Country = 'Germany' ) );
produces a list of students with their addresses enrolled from 2013 and resident in Germany, in no
particular order. Note that the condition may involve a field whether or not it appears among the query's
fields, but the field must obviously be in the selected table. Spaces in the condition and parentheses around
the condition are not mandatory, but improve readability and are very helpful when conditions become
complicated.
There are other operators and functions which are helpful in conditions:
• the usual mathematical operators +, -, *, /
• the usual mathematical comparisons =, <, >, <=, >=, <>
• {field} BETWEEN {value1} AND {value2}, for example Price BETWEEN 20 AND 50
• {condition1} AND {condition2}, for example ( Price > 50 ) AND ( Quantity < 20 )
• {condition1} OR {condition2}, for example ( Price > 50 ) OR ( Quantity < 20 )
• NOT {condition}, which reverses the condition after it, for example NOT ( Country = 'Germany' )
• IS NULL and IS NOT NULL, to check whether a value is empty or not, such as ( `End date` IS NOT
  NULL )
• {field} IN ({list}), very useful when you need to filter on values in a list, for example Country IN
  ( 'Germany', 'Italy', 'Austria', 'France' )
• {field} LIKE {expression}, for approximate matching filters, such as Surname LIKE 'Col%', which
  selects only surnames starting with Col, or Surname LIKE 'Col_t', which selects only surnames with
  exactly one character in place of the underscore, such as Colet, Colit, Coltt, Col4t, etc.
• CURDATE(), to be used in expressions where the current date (with midnight time) is needed, for
  example to select students enrolled today: Enrolment_Date = CURDATE(). Pay special attention
  when you compare this function with a date including time, as this function gives you the current
  date with midnight time and can give you the wrong result; for example Enrolment_Time <=
  CURDATE() does not return people enrolled today
• DATE_ADD({date}, {interval}) to add (or subtract when the interval is negative) days, months or years
  to a date, for example DATE_ADD('2012-01-09', INTERVAL 14 MONTH) results in 2013-03-09
• YEAR({date}) to get the year from a date, such as YEAR(Enrolment_Date) = 2012
• MONTH({date}) to get the numerical month from a date, such as MONTH(Enrolment_Date) = 9
• DAY({date}) to get the day of the month from a date, such as DAY( CURDATE() ) = 1, which is true
  only when today is the first day of the month
• DATEDIFF({date2},{date1}), which, unlike the corresponding Access function, returns the difference
  between two dates in days and not in years. Two ways to get the difference in years from it are the
  approximate way ROUND(DATEDIFF({date2},{date1})/365.25) and this trick for the exact difference:
  YEAR({date2}) - YEAR({date1}) - ( DATE_FORMAT( {date2}, '%m%d' ) < DATE_FORMAT( {date1}, '%m%d' ) ).
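As a sketch of how these operators combine, the following statements assume a Students table with fields FirstName, LastName, BirthDate, Enrolment_Date and Residence_Country; BirthDate and the literal dates are assumptions made only for illustration. The last statement compares the approximate and the exact difference in years on two fixed dates. MySQL also provides TIMESTAMPDIFF(YEAR, {date1}, {date2}), which returns the number of complete years directly.

```sql
-- Students from three countries, enrolled in the last 12 months,
-- whose surname starts with Col
SELECT FirstName, LastName, Enrolment_Date
FROM Students
WHERE ( Residence_Country IN ( 'Germany', 'Italy', 'Austria' ) )
  AND ( Enrolment_Date BETWEEN DATE_ADD( CURDATE(), INTERVAL -12 MONTH ) AND CURDATE() )
  AND ( LastName LIKE 'Col%' )
ORDER BY LastName ASC, FirstName ASC;

-- Age in complete years of each student (exact formula from the text)
SELECT FirstName, LastName,
  YEAR( CURDATE() ) - YEAR( BirthDate )
    - ( DATE_FORMAT( CURDATE(), '%m%d' ) < DATE_FORMAT( BirthDate, '%m%d' ) ) AS Age
FROM Students;

-- Approximate versus exact difference on fixed dates:
-- a person born on 1994-12-10 is still 19 years old on 2014-09-24
SELECT
  ROUND( DATEDIFF( '2014-09-24', '1994-12-10' ) / 365.25 ) AS Approximate,  -- 7228 days, gives 20
  YEAR( '2014-09-24' ) - YEAR( '1994-12-10' )
    - ( DATE_FORMAT( '2014-09-24', '%m%d' ) < DATE_FORMAT( '1994-12-10', '%m%d' ) ) AS Exact,  -- gives 19
  TIMESTAMPDIFF( YEAR, '1994-12-10', '2014-09-24' ) AS BuiltIn;  -- gives 19
```

The comparison inside the exact formula returns 1 when the birthday has not yet occurred in the current year and 0 otherwise, which is why it can simply be subtracted.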


4.2.1. Virtual fields


New virtual fields for the resulting temporary table can be automatically generated taking values from any
field, even fields not present in the query (but they must be present in the tables used by the query), and
applying mathematical, logical or textual operations. We simply write AS after the expression:
SELECT {expression} AS {name} FROM {table};
For example, to build the virtual field Discounted_Price, which contains the price of each product reduced
by 10%, we simply type in the fields list
SELECT ProductName, Price * 0.9 AS Discounted_Price FROM Products;
To build a virtual field as the sum of two other fields
SELECT StudentID, Tax_First_Semester + Tax_Second_Semester AS TotalTax FROM Taxes;
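A short sketch of how a virtual field interacts with filtering and sorting, assuming the Products and Students tables used in the examples of this section. In MySQL the name given with AS can be used in ORDER BY but not in WHERE, so the condition has to repeat the formula. The second statement shows a textual operation with CONCAT; the field name FullName is only illustrative.

```sql
-- Discounted_Price is a virtual field: WHERE repeats the formula,
-- while ORDER BY may use the name given with AS
SELECT ProductName, Price, Price * 0.9 AS Discounted_Price
FROM Products
WHERE ( Price * 0.9 < 50 )
ORDER BY Discounted_Price DESC;

-- A textual virtual field built by joining two fields with a space
SELECT CONCAT( FirstName, ' ', LastName ) AS FullName
FROM Students
ORDER BY FullName ASC;
```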

4.2.2. Views
In case we plan to use the query again in the future, or we simply want to save it inside the database, or we
plan to use its temporary table as a source table for another query, it is a good idea to make it permanent
by creating a view with
CREATE VIEW {name} AS {query};
For example,
CREATE VIEW List_Of_Discounted_Products AS SELECT ProductName, ProductDescription, Price * 0.9 AS
Discounted_Price FROM Products;
This view can be used later by another query as a source table, for example
SELECT ProductName, Discounted_Price FROM List_Of_Discounted_Products ORDER BY ProductName ASC;
To destroy a view, usually because we want to rebuild it in another way:
DROP VIEW {name};
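As a possible shortcut when a view has to be rebuilt, MySQL also accepts CREATE OR REPLACE VIEW, which avoids the separate DROP VIEW step. The sketch below reuses the view of the example above; the 20% discount and the threshold of 100 are arbitrary values chosen for illustration.

```sql
-- Rebuild the view in one step instead of DROP VIEW followed by CREATE VIEW
CREATE OR REPLACE VIEW List_Of_Discounted_Products AS
SELECT ProductName, ProductDescription, Price * 0.8 AS Discounted_Price
FROM Products;

-- The view is then filtered and sorted exactly like a table;
-- here Discounted_Price is a real column of the view, so WHERE can use it
SELECT ProductName, Discounted_Price
FROM List_Of_Discounted_Products
WHERE ( Discounted_Price < 100 )
ORDER BY Discounted_Price ASC;

-- Show the command which defines the view
SHOW CREATE VIEW List_Of_Discounted_Products;
```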

4.3. Inner joins


If we want to take values from two tables, the SQL language is not able to correctly follow the relations
unless it is explicitly instructed to do so. This is due to the fact that SQL is a generic language which
works for every database management program, even those which do not handle automatic relations or
referential integrity. Therefore, we need to remind SQL which relations are involved and how they work.
The syntax is
SELECT {fields} FROM {table1} INNER JOIN {table2} ON {field1} = {field2};
where {field1} and {field2} are the foreign and primary key of the related tables, while {fields} are the names
of the selected fields from {table1} and {table2}.
For example,
SELECT Name, Surname, Number FROM People INNER JOIN Telephones ON Person_code=Owner;
or
SELECT Name, Surname, `Arrival position` FROM Drivers INNER JOIN Participants ON `Tax code`=Driver;
In case the same field name is used in both tables, the fields' names must be preceded by the table's name
and a dot. For example,
SELECT Secretaries.FirstName, Secretaries.LastName, Students.LastName FROM Students INNER JOIN
Secretaries ON Secretaries.Code = Students.Secretary;
In this way it is much clearer where each field comes from.
Obviously these queries can also use filtering conditions with WHERE, sorting conditions with ORDER BY and
renaming of fields with AS.
Moreover, since the table name might be rather long and boring to rewrite several times, the
AS operator may also be used to rename a table, such as
SELECT b.FirstName, b.LastName, a.LastName FROM Students AS a INNER JOIN Secretaries AS b ON b.Code
= a.Secretary;
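Putting the pieces together, an inner join can carry table aliases, renamed fields, a filter and a sort at the same time. The sketch assumes the Students and Secretaries tables of the examples above, with Students.Secretary as the foreign key to Secretaries.Code; the names of the renamed fields are illustrative.

```sql
-- Students whose surname starts with C, each with the surname of the
-- secretary in charge, sorted by secretary and then by student
SELECT st.FirstName, st.LastName AS StudentLastName, se.LastName AS SecretaryLastName
FROM Students AS st INNER JOIN Secretaries AS se ON se.Code = st.Secretary
WHERE ( st.LastName LIKE 'C%' )
ORDER BY se.LastName ASC, st.LastName ASC;
```

Renaming the two LastName fields with AS keeps the columns of the temporary table distinguishable, since otherwise both would be displayed with the same name.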


4.3.1. Cross joins


A cross join is a query involving several tables without performing any join, such as, using the tables in
section 1.2.2,
SELECT People.Name, People.Surname, Telephones.Number FROM People, Telephones;
The result is the global combination of all the records of all the tables. In the example's case, a huge table
with John Smith and all the 19 telephone numbers, then John McFlurry and all the same 19 telephone
numbers, etc. This is very rarely what is wanted from the database, even though sometimes it might be.
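A quick way to see how large a cross join becomes is to count its records, here sketched on the People and Telephones tables of the example above. The last statement writes the same cross join with the explicit CROSS JOIN keyword, which MySQL also accepts.

```sql
-- Number of records of each table
SELECT COUNT(*) FROM People;
SELECT COUNT(*) FROM Telephones;

-- Number of records of the cross join: the product of the two counts above
SELECT COUNT(*) FROM People, Telephones;

-- The same cross join written explicitly
SELECT People.Name, People.Surname, Telephones.Number
FROM People CROSS JOIN Telephones;
```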

4.3.2. Multiple inner joins


When three or more tables are involved, there must be a compound inner join, even when no field is taken
from one of the tables (typically a junction table). Using the example in section 1.5,
SELECT Houses.SquareMeters, Owners.LastName FROM ( Houses INNER JOIN PropertyActs ON Houses.Address =
PropertyActs.House ) INNER JOIN Owners ON PropertyActs.Owner = Owners.TaxCode;
Even though the inner join seems more difficult to use, it exactly describes what the query is doing: taking
all three tables and squeezing their content relation by relation until a single table is reached.
Another example using the details table in section 1.5.1, written in several lines just to improve readability,
SELECT c.Surname, c.Name, p.ProductName, od.Quantity, p.UnitPrice FROM
( ( ( Orders INNER JOIN `order details` AS od ON Orders.OrderID = od.OrderID )
INNER JOIN Products AS p ON od.ProductID = p.ProductID )
INNER JOIN Customers AS c ON c.CustomerID = Orders.CustomerID );
As is evident, using the same field name for the primary and the foreign key makes detecting the
relations much easier.
Another good trick is avoiding any strange character and any space in tables' and fields' names, otherwise
they must be enclosed in the special quotation symbol grave accent ` (see footnote 25). We must take special
care that the symbol is not the usual apostrophe or single quotation mark ', which instead is used for values
such as text and dates, but the one in the opposite direction, the grave accent. For example, to display all
the houses and their current owners (for whom `End date` is either Null or in the future)
SELECT Address, `Square meters`, p.Percentage, Owners.Name, Owners.Surname FROM
( ( Houses INNER JOIN `Property Acts` AS p ON Houses.Address = p.House )
INNER JOIN Owners ON Owner = `Tax code` )
WHERE ( ( p.`Begin date` <= CURDATE() ) AND ( ( p.`End date` IS NULL ) OR ( p.`End date` > CURDATE() ) ) );

Difference between WHERE and ON


Theoretically an inner join query could be rewritten as a cross join using a WHERE condition and
produce the same result. For example
SELECT Name, Surname, Number FROM People INNER JOIN Telephones ON Person_code=Owner;
and
SELECT Name, Surname, Number FROM People, Telephones WHERE Person_code=Owner;
produce exactly the same result. The second query, however, conceptually performs first a cross join which
produces a very large table and then reduces it with a filter. While for small amounts of data this is not a
problem, when the number of involved tables and records increases, a query performed in this way can be
inefficient and very slow. Moreover, the latter query is misleading from a theoretical point of view and can
confuse users who must check and modify it further.
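Whether the two forms are really executed differently depends on the server. One way to check, sketched here on the People and Telephones tables of the example, is to put EXPLAIN in front of each query: MySQL then describes how it plans to read the tables instead of running the query. The content of the description varies with the server version, the indexes and the data, so no expected output is shown.

```sql
-- Execution plan of the inner join form
EXPLAIN SELECT Name, Surname, Number
FROM People INNER JOIN Telephones ON Person_code = Owner;

-- Execution plan of the cross join with WHERE form
EXPLAIN SELECT Name, Surname, Number
FROM People, Telephones WHERE Person_code = Owner;
```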

25 In case this symbol does not appear on the keyboard, in HeidiSQL any table or field name can be easily
produced, together with the enclosing symbol, by double clicking on the table's or field's name. Producing it
directly without copying and pasting it from another part of the code is rather complicated: we press
Windows key + R, then we type charmap, then we go through the character map until we find the grave
accent, and we copy and paste it.

4.4. Summary queries


A simple selection query is able to perform calculations to create virtual fields, but only acting
record by record, while it is not able to perform operations involving different records, such as sums or
averages. For example, "How many students does each secretary have in charge in 2009?" or "What is the
average students' grade this year for each country?".
To do this, we need a summary query. Its syntax is exactly the same as that of the simple query, with the two
differences that an aggregation function (Sum, Avg, Max, Min, Count(*), Count(DISTINCT {field})) is applied to
the field on which the operation across records must be done and that a GROUP BY {fields} is optionally
appended at the end of the SQL command. The scope of GROUP BY is to aggregate the records,
ending up with one record per value of the grouping fields.
For example, if we have a database with a Students table with a Country field and an Exams table with a
Grade field, and we want to calculate the average and minimum grade for each country, the SQL is
SELECT Country, Avg(Grade), Min(Grade) FROM Students INNER JOIN Exams ON Students.StudentNumber =
Exams.StudentNumber GROUP BY Country;
The result would be a temporary table with three columns, the first one containing the countries, the other
two containing the average and minimum grades (calculated on all exams and on all students) of the
students of that country. The summary query squeezes all the records which have the same value for the
GROUP BY fields, and only one row for each different value of the grouped fields appears. The summary
query can also use several fields for grouping, separated by commas, such as
SELECT Country, `Degree course`, Avg(Grade), Min(Grade) FROM Students INNER JOIN Exams ON
Students.StudentNumber = Exams.StudentNumber GROUP BY Country, `Degree course`;
Whenever the Count function is used, it can be applied on a field, like the other aggregation functions, or it
can count the number of records using an asterisk, as in
SELECT Country, Count(*) FROM Students GROUP BY Country;
which counts the students for each country, while
SELECT Country, Count(DISTINCT Surname) FROM Students GROUP BY Country;
counts the surnames, i.e. identical surnames will be considered as one surname and counted only once.
Filtering conditions may be applied to summary queries. However, we have two different options this time:
• the usual syntax WHERE {condition}, which filters before the aggregation and thus must be used on
  fields which will not exist anymore after aggregation. It must also be written before the GROUP BY
  instruction;
• HAVING {condition}, written after the GROUP BY instruction, which filters after the aggregation and thus
  must be used on fields which appear only after it, for example on aggregation functions;
• both conditions can be used on fields which remain the same during the aggregation process, typically
  fields which are used for grouping.
For example, to calculate the average grade per country in 2013
SELECT Country, Avg(Grade) FROM Students INNER JOIN Exams ON Students.StudentNumber =
Exams.StudentNumber WHERE ( YEAR(Exams.Date) = 2013 ) GROUP BY Country;
but to calculate the average grade per country, considering only countries starting with G,
SELECT Country, Avg(Grade) FROM Students INNER JOIN Exams ON Students.StudentNumber =
Exams.StudentNumber GROUP BY Country HAVING ( Country LIKE 'G%' );
or equivalently,
SELECT Country, Avg(Grade) FROM Students INNER JOIN Exams ON Students.StudentNumber =
Exams.StudentNumber WHERE ( Country LIKE 'G%' ) GROUP BY Country;
To display the countries which have an average grade above 26, since the condition is on a summary
operation, the command can only be
SELECT Country FROM Students INNER JOIN Exams ON Students.StudentNumber = Exams.StudentNumber
GROUP BY Country HAVING ( Avg(Grade) > 26 );
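WHERE and HAVING can also appear in the same summary query, each doing its own job. The following sketch, on the Students and Exams tables used above, first keeps only the exams of 2013 (WHERE, before aggregation) and then keeps only the countries with at least 10 such exams and an average above 26 (HAVING, after aggregation). The threshold of 10 and the names given with AS are illustrative assumptions.

```sql
SELECT Country, Count(*) AS ExamsIn2013, Avg(Grade) AS AverageGrade
FROM Students INNER JOIN Exams ON Students.StudentNumber = Exams.StudentNumber
WHERE ( YEAR(Exams.Date) = 2013 )                     -- filters single exams
GROUP BY Country
HAVING ( Count(*) >= 10 ) AND ( Avg(Grade) > 26 )     -- filters whole countries
ORDER BY AverageGrade DESC;
```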


4.5. Modifying records


There are useful commands which can modify the existing records of one table, provided that the user has
the appropriate update, insert or delete privileges. Even though these queries can also modify records from
several tables at the same time, it is never a good idea to do it, as the query becomes less controllable, and
it is much safer to do it on one table at a time. Remember also that in many versions of MySQL it is very
difficult to undo changes (see section 4.1).
To delete all the records of a table, keeping its structure, the syntax is
TRUNCATE {table};
To delete some records from a table the safest way is selecting them manually through a graphical client,
like HeidiSQL. However, if the table contains many records and it is difficult to spot one by one the records
which need to be deleted, it can be faster and even safer to use a delete query with syntax
DELETE FROM {table} WHERE {condition};
It is a good idea to first run this query as a selection query, replacing DELETE with SELECT * (or with
SELECT COUNT(*) in case records are many), and then, if everything is correct, run the deletion query.
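The check-then-delete procedure can be sketched as follows, assuming a Students table with an Enrolment_Date field; the date is an arbitrary example. The final part shows the ROLLBACK mentioned in section 4.1: it is an assumption that the table uses a transactional engine such as InnoDB, since on MyISAM tables the deletion is immediately permanent.

```sql
-- Step 1: look at the records which would be deleted
SELECT * FROM Students WHERE ( Enrolment_Date < '2005-01-01' );

-- Step 2: or, if they are many, just count them
SELECT COUNT(*) FROM Students WHERE ( Enrolment_Date < '2005-01-01' );

-- Step 3: only when the result is what we expect, delete with the same condition
DELETE FROM Students WHERE ( Enrolment_Date < '2005-01-01' );

-- Alternative for transactional tables only: delete inside a transaction,
-- check the result, then undo with ROLLBACK or confirm with COMMIT
START TRANSACTION;
DELETE FROM Students WHERE ( Enrolment_Date < '2005-01-01' );
SELECT COUNT(*) FROM Students;
ROLLBACK;
```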

To modify records in a table it is, as in the previous case, a good idea to manually edit them using a
graphical client, but when a lot of similar modifications are requested, the syntax that can be used is
UPDATE {table} SET {field} = {value} WHERE {condition};
For example, to set language German to all students with residence country Germany, the command is
UPDATE Students SET language = 'German' WHERE residence_country = 'Germany';
An update query is often applied to one table but the condition uses fields from other related tables. In this
case an inner join is necessary, and the syntax becomes slightly more complicated, since in MySQL the joined
tables are written immediately after UPDATE
UPDATE {tables with inner join} SET {field} = {value} WHERE {condition};
For example, to set the grade of all exams to 30 (in table Exams) for students whose surname (in table
Students) starts with Col, the command is
UPDATE Exams INNER JOIN Students ON Exams.StudentNumber = Students.StudentNumber
SET Exams.grade = 30 WHERE Students.surname LIKE 'Col%';
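As with deletions, an update involving a join can be previewed with a selection query on the same join and condition. The sketch uses the Exams and Students tables of the example and the MySQL form of the multiple-table update, in which the joined tables follow UPDATE directly and SET comes after them.

```sql
-- Preview: which exams would be modified, and their current grade
SELECT Exams.StudentNumber, Students.Surname, Exams.Grade
FROM Exams INNER JOIN Students ON Exams.StudentNumber = Students.StudentNumber
WHERE ( Students.Surname LIKE 'Col%' );

-- Update with the same join and the same condition
UPDATE Exams INNER JOIN Students ON Exams.StudentNumber = Students.StudentNumber
SET Exams.Grade = 30
WHERE ( Students.Surname LIKE 'Col%' );
```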
To insert records in a table the fastest tool is importing them from a text file, as explained in section 4.6, or
inserting them manually record by record in a graphical client. However, the syntax to manually insert them
using a query is
INSERT INTO {table} ({fields}) VALUES ({values}), ..., ({values});
For example, to insert three records in table Students
INSERT INTO Students ( StudentNumber, FirstName, Surname, BirthDate, EnrolmentYear ) VALUES ( 8712, 'John',
'Smith', '1994-04-01', 2012 ), ( 8713, 'Vanda', 'Black', '1994-12-10', 2011 ), ( 8717, 'Mary', 'White', '1995-05-01',
2012 );
It is not necessary to specify all the fields of the table: any field not specified in the {fields} list will be
assigned an automatic number if it has the AUTO_INCREMENT option (see section 4.7.1) or its default value
if it has the DEFAULT option, otherwise MySQL will assign a Null value.
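A minimal sketch of an insertion which leaves some fields unspecified, using the Students table of the example (the inserted values are invented). The second part assumes a hypothetical table Courses whose field ID has the AUTO_INCREMENT option; LAST_INSERT_ID() then returns the number which MySQL generated.

```sql
-- BirthDate and EnrolmentYear are not listed, so they receive their
-- default value or Null
INSERT INTO Students ( StudentNumber, FirstName, Surname )
VALUES ( 8720, 'Anna', 'Green' ), ( 8721, 'Luca', 'Rossi' );

-- Check what has been stored
SELECT * FROM Students WHERE StudentNumber IN ( 8720, 8721 );

-- Hypothetical table Courses with an AUTO_INCREMENT field ID:
-- the ID is not listed and is generated automatically
INSERT INTO Courses ( CourseName ) VALUES ( 'Databases' );
SELECT LAST_INSERT_ID();
```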

4.6. External data


When using a graphical client it is easy to import and export external data in textual form. The standard
format is a plain text file in CSV (comma-separated values) format, usually with semicolon or tab as
separator.
To export using HeidiSQL, we open the database, we click on the table in the left window, we click on
Datasheet and we choose the menu Tools → Export grid rows. A dialog window opens up and there are
several self-explanatory options, among which the most delicate is the Field separator. The typical choices
are tab, indicated with \t, or semicolon; however, any other character is good provided that it never
appears among the exported values, otherwise it will create confusion. If finding such a character is difficult
since the table contains long textual fields, which may have any character including tab, then the best
choice is using tab but inserting also an Encloser character, usually single or double quote, which will
surround all the exported values.
To import using HeidiSQL, two operations must be performed before beginning. First, the table must
already exist; if it does not, we must prepare its structure before importing, as explained in section 4.7.2.
Then we need to open the data file with a plain text editor (such as Notepad) and carefully check which is
the field termination character and whether values are enclosed in quotations or not.
Finally, we open the database, we click on the table into which we plan to insert the data in the left
window, then we choose Tools → Import CSV file. It is better, to avoid possible incompatibilities, to choose
"Client parses file contents" as Method. We choose carefully the Fields terminator and the Fields encloser,
if it exists in the file. In the fields list, we unselect the fields which are not present in the text file and we
rearrange the fields in case they are in a different order in the text file.
The entire importing procedure is very tricky: very often it needs to be repeated several times, choosing
REPLACE (duplicates) in the further attempts, before a complete import without errors is reached.
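The same import can also be written as an SQL command, LOAD DATA INFILE. The sketch below is only an illustration under several assumptions: the file path is hypothetical, the file is supposed to use semicolons as separators, double quotes as optional enclosers, Windows line endings and one header line, and the server and client must allow the LOCAL option.

```sql
-- Import a CSV file from the client computer into table Students;
-- REPLACE overwrites records which have the same primary key
LOAD DATA LOCAL INFILE 'C:/data/students.csv'
REPLACE INTO TABLE Students
FIELDS TERMINATED BY ';' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 LINES
( StudentNumber, FirstName, Surname, BirthDate, EnrolmentYear );
```

The final list of fields plays the same role as the fields list in the HeidiSQL dialog: it states which fields are present in the file and in which order.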

4.7. Tables
Each table in a MySQL database has a list of fields, each one being able to store only a specific data type. To
see the structure of a table we use the syntax DESCRIBE {table}. Moreover, in order to discover the syntax
which has been used to produce the structure of a table, SQL offers the syntax SHOW CREATE TABLE {table}. In
this way it is possible, even for novice users, to see an example of a similar table and partially copy it to
produce other structures using the following syntax, where newlines are inserted only for readability
purposes,
CREATE TABLE {table} (
{field} {field type} {options},
...
{field} {field type} {options},
PRIMARY KEY ({field}),
INDEX ({field})
);
For example, to produce a table Students,
CREATE TABLE Students (
`student number` INTEGER,
name VARCHAR(45) NOT NULL,
surname VARCHAR(45) NOT NULL,
`enrolment date` DATE,
Notes TEXT,
PRIMARY KEY ( `student number` ),
INDEX ( surname ),
INDEX ( `enrolment date` )
);
The index indication states that the field is frequently used in search, sorting or joining operations, so
that MySQL can prepare itself for them; every foreign key is obviously an index, while for the primary key it
is not necessary to indicate it, as it is implicit in the primary key indication. Concerning other fields, for
some it is obvious that they must be indexes, for example field surname in every people's table, for others
the decision is up to the database designer, while for some others it is evident that they are not indexes,
such as fields Notes or Picture.
We can add some check constraints to a table to limit the possibility of wrong data insertions, adding to
the syntax CHECK ({condition}). For example, to admit only students with a number larger than 1000 and an
enrolment date not in the future
CREATE TABLE Students (
`student number` INTEGER,
name VARCHAR(45) NOT NULL,
surname VARCHAR(45) NOT NULL,
`enrolment date` DATE,
Notes TEXT,
PRIMARY KEY ( `student number` ),
INDEX ( surname ),
INDEX ( `enrolment date` ),
CHECK ( `student number` > 1000 ),
CHECK ( `enrolment date` <= CURDATE() )
);
Unfortunately, up to version 5.6 MySQL accepts any CHECK command but does not really implement the
CHECK constraint!
Even though it is never a good idea to do it (with the exception of adding other fields, a rather safe
operation), to change a table structure after it has been created the following syntaxes can be used:
DROP TABLE {table};
ALTER TABLE {table} ADD {field} {field type} {options};
ALTER TABLE {table} DROP {field};
ALTER TABLE {table} ADD PRIMARY KEY ({field});
ALTER TABLE {table} DROP PRIMARY KEY;
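A brief sketch of the safest of these operations, adding a field, applied to the Students table; the field name email and its length are assumptions made for the example. DESCRIBE is used to verify the new structure.

```sql
-- Add a new field; existing records receive a Null value in it
ALTER TABLE Students ADD email VARCHAR(80);

-- Declare the new field as an index
ALTER TABLE Students ADD INDEX ( email );

-- Check the new structure of the table
DESCRIBE Students;

-- Remove the field again (its values are lost)
ALTER TABLE Students DROP email;
```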

4.7.1. Field types


Each field in a MySQL table must have a predefined type, according to what it is going to contain. The most
frequently used types are:
• INT, for an integer number from about -2 billion up to about 2 billion
• TINYINT, for a very small integer number, typically 0 or 1 used for boolean values
• DECIMAL ( {maximum total number of digits} , {number of decimal digits} ), for an exact decimal number
• FLOAT, for a non-exact real number
• CHAR ( {number of characters} ), for fixed length text
• VARCHAR ( {maximum number of characters} ), for variable length text
• TEXT, for very long text up to about 65,000 characters
• ENUM ( {value} , ... , {value} ), for text values only in the indicated list (warning: numbers must be
  inserted as text; if '7' is inserted, it is interpreted as value 7, but if 7 is inserted, it is interpreted as
  the 7th value of the values list)
• DATE, for a date (time is supposed to be midnight)
• DATETIME, for a date with time.
These types can also have some options, even combined together. The most important ones are:
• NOT NULL, this field may never contain a Null value
• AUTO_INCREMENT, this field will contain an automatically incremented numerical value whenever
  a new record is inserted without a specification of this field's value
• UNIQUE, all records must contain a different value for this field
• DEFAULT {default value}, any time a new record is inserted without specifying a value for this field,
  the default value will be used.
Note that the primary key field automatically has the UNIQUE and NOT NULL options and is an index field.
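To see types and options working together, here is a sketch of a table which uses most of them. The table Products_Example and all its fields are hypothetical, invented only for this illustration.

```sql
CREATE TABLE Products_Example (
  ProductID INT NOT NULL AUTO_INCREMENT,       -- automatic number
  ProductName VARCHAR(60) NOT NULL,            -- variable length text, required
  Barcode CHAR(13) UNIQUE,                     -- fixed length text, no duplicates
  Price DECIMAL(8,2) NOT NULL DEFAULT 0,       -- up to 999999.99
  Category ENUM( 'food', 'drink', 'other' ) NOT NULL DEFAULT 'other',
  Available TINYINT NOT NULL DEFAULT 1,        -- 1 for yes, 0 for no
  ProductDescription TEXT,
  LastUpdate DATETIME,
  PRIMARY KEY ( ProductID )
);

-- Only two fields are specified: ProductID is generated automatically,
-- Category and Available take their default values, the others are Null
INSERT INTO Products_Example ( ProductName, Price ) VALUES ( 'Apple juice', 2.50 );
SELECT * FROM Products_Example;
```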


4.7.2. Creating tables with HeidiSQL


In HeidiSQL it is possible to create tables with the user-friendly graphical interface. To do it, we right-click
the mouse on the database in the left window and we choose Create new → Table. This opens up a series of
tabs in the central window: in the Basic tab we can enter the fields one by one with their type, in the
Options tab there are advanced options (use always MyISAM engine type, as it is the fastest) and in the
Create code tab we see the automatically built CREATE TABLE command. To define a primary key, a unique
or a key field, rather than using the Index tab, it is easier to right-click the mouse on the field and choose
Create new index.


5. Designing a database (level 2)


Two things are very important for planning a database. The first one is a full understanding of the situation:
how data are used and how they are going to be extracted from the database. It is a good idea to read the
instructions several times (or to write them down in case the database is your own idea) and make brief
simulations, with pencil on paper or with fake Excel database tables, of which kind of data are going to be
stored inside the database. In this way a lot of small errors and misunderstandings, which can lead to a
structural change of the database, can be avoided.
The second one is that the schema must be fully completed before data are entered. Entering data is
usually a time-consuming procedure, either because they are manually typed in or because they are
imported from other tables which must be well structured before the import operation. Modifying the
schema, or just a single relation, after data have been entered often means huge data cancellation or
restructuring, with the serious risk of losing data.

5.1. Paper diagram


The first step is a paper diagram of the schema, containing all the tables with relations and foreign keys. A
good strategy is to start with the most external tables, the ones which do not contain foreign keys. These
tables are the pillars of the database, usually containing information which rarely changes, such as data on
people, objects, places. The other tables should automatically come out either as dependent on these tables
or as junction tables between them.
For example, let's build database MyFarm for a small fruit farm company, which has fields on which
different crops are cultivated and storages where crops are stored; each crop must all go into the same
storage. We start with tables Fields and Storages, which represent the external tables.

[Diagram: tables Fields, Crops and Storages]



Now we check the relations. Since we have the restriction that a crop must go into one and not more
than one storage, we automatically have a many-to-one relation between Crops and Storages, meaning
that Crops in this database is not an external table but a dependent table. Crops needs the foreign key for
the relation, which can be a field called WhereStored.

[Diagram: tables Fields, Crops and Storages; foreign key WhereStored in Crops is related to Name in
Storages, with Storages on the 1 side]


Now we analyze the relation between Fields and Crops, which is clearly many-to-many and therefore needs
a junction table.

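[Diagram: tables Fields, Plantings, Crops and Storages; Field in Plantings is related to Name in Fields, Crop
in Plantings is related to ID in Crops, WhereStored in Crops is related to Name in Storages; Fields, Crops
and Storages are on the 1 side of their relations]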

Concerning relations, it is very important to check that
• one-to-many and many-to-one relations are properly oriented,
• no relation is instead a one-to-one,
• many-to-many relations are dealt with appropriate junction tables and possibly a details table,
• the "1" side of relations is really a "1" side (either because there are obvious reasons that
  guarantee so, for example in a relation between Exams and Students, or because we want it to
  be like this, for example in the relation between Secretaries and Students) and does not instead
  need a many-to-many relation.
In our example, we therefore analyze carefully the fact that putting a many-to-one relation between Crops
and Storages automatically implies that each crop is stored in only one storage while a storage can accept
different crops. The Plantings table is a junction table between Fields and Crops: in fact in the same field we
can plant several crops (either dividing the field in parts or considering a historical database where each
field can be used for different crops in different years) and, clearly, the same crop can be planted in several
fields.
Now we can enrich the diagram putting all the other fields and assigning primary keys, checking that no
duplicate values of the primary key exist. Special attention must be taken also for fields, in particular:
• no field may exist which can be calculated automatically, either using the fields of the same
  table or through counting or summing or averaging operations performed on other tables,
• fields must be put in the appropriate table, as changing the table in which a field is located
  completely changes the meaning of that field. Usually, a field in an external table is strictly
  bound to that object and rarely changes, while fields in the junction tables are bound to the
  action which is expressed by that table.
In our example, we assign the primary key to the Name field for Fields and Storages tables, keeping in mind
that neither two Fields nor two Storages with the same name may exist. For Plantings we decide to use an ID,
while for Crops a composite primary key could be Type and Subtype, but in order to avoid problems with
composite primary keys we choose to use an ID.
Note that field IrrigationSystem is in table Fields, meaning that it is a feature of the field regardless of which
crops are planted in the field. If it were in table Plantings, it would mean that the irrigation system changes
according to the kind of crop and to the field; if it were in table Crops it would mean that it is bound to
the specific type of crop and is thus transported to the field according to which crop is present.

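[Diagram: the four tables with all their fields]
Fields: Name (primary key), Town, Street, Slope, PH, Size, Accessibility, IrrigationSystem
Plantings: Field (foreign key, related to Name in Fields), Crop (foreign key, related to ID in Crops),
ID (primary key), Number, PlantingDate, MaturityDate, RemovingDate, Percentage
Crops: WhereStored (foreign key, related to Name in Storages), ID (primary key), Type, Subtype, PlantType,
HarvestMonth, MinimumPH, MaximumPH, PlantsPerHectare, ProductionPerPlant, CostPerPlant,
CropValuePerKg
Storages: Name (primary key), Town, Address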

Especially when the paper diagram has to be submitted to somebody else, it is a good idea to write now a
short description of the database clarifying all the non-obvious fields and relations. This is also a good
opportunity for a final recheck of the structure.

5.2. Building the tables


Now we can start to build tables. It is better to start from the external ones, i.e. the tables which do not
have foreign keys, since these tables do not take values from other tables, and then go on to the other
tables. For each field the appropriate type must be chosen, and primary keys must be defined, while other
fields' options can be decided later.


For example, in the previous database:
• for table Fields:
  o Name, Town, and Street contain text (text for Access; VARCHAR(30), VARCHAR(50), VARCHAR(50) for
    MySQL);
  o Slope, PH and Size contain numbers (number for Access; DECIMAL(4,2), DECIMAL(3,2), INTEGER for
    MySQL);
  o Accessibility and IrrigationSystem contain yes/no (yes/no for Access; ENUM('yes','no') or INTEGER for
    MySQL);
• for table Storages: Name, Town, and Address contain text (text for Access; VARCHAR(30), VARCHAR(50),
  VARCHAR(50) for MySQL);
• for table Crops:
  o ID is an automatic incrementing number (autonumber for Access; INTEGER with AUTO_INCREMENT
    for MySQL);
  o Type, Subtype and PlantType contain text (text for Access; VARCHAR(30), VARCHAR(30), VARCHAR(10)
    for MySQL);
  o HarvestMonth can contain text (text for Access; VARCHAR(9) or ENUM('January', ..., 'December') for
    MySQL) or number (number for Access; INTEGER for MySQL);
  o MinimumPH, MaximumPH, PlantsPerHectare, ProductionPerPlant, CostPerPlant, and
    CropValuePerKg contain numbers (number for Access; DECIMAL(3,2), DECIMAL(3,2), INTEGER, FLOAT,
    DECIMAL(6,2), FLOAT for MySQL);
• for table Plantings:
  o ID is an automatic incrementing number (autonumber for Access; INTEGER with AUTO_INCREMENT
    for MySQL);
  o Number contains a number (number for Access; INTEGER for MySQL);
  o Percentage contains numbers (number, single with 2 decimal digits and percentage format, for Access;
    DECIMAL(5,2) for MySQL);
  o PlantingDate, MaturityDate, and RemovingDate are dates (date/time for Access; DATE for MySQL).
Foreign keys must be set in MySQL to the same type as the corresponding primary key.
Foreign keys can be set in Access
• to the same type as the corresponding primary key and then the relation can be built in the
  relationships diagram, remembering to enforce referential integrity;
• as Lookup Wizard type, choosing as displayed values the appropriate meaningful fields from the related
  table, hiding the primary key when it is an ID (and thus not meaningful for a human operator) or
  unhiding it when it is meaningful. Then referential integrity is enforced in the relationships diagram.
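For the MySQL case, the four tables of MyFarm could be created as in the following sketch, which uses the types listed above and starts from the external tables. Two choices here are assumptions and not part of the original design: HarvestMonth is stored as a number, and the relations are declared with FOREIGN KEY on the InnoDB engine, since MySQL enforces the FOREIGN KEY constraint only with InnoDB, while with MyISAM it is accepted but ignored.

```sql
-- External tables first: they have no foreign keys
CREATE TABLE Storages (
  Name VARCHAR(30) NOT NULL,
  Town VARCHAR(50),
  Address VARCHAR(50),
  PRIMARY KEY ( Name )
) ENGINE=InnoDB;

CREATE TABLE Fields (
  Name VARCHAR(30) NOT NULL,
  Town VARCHAR(50),
  Street VARCHAR(50),
  Slope DECIMAL(4,2),
  PH DECIMAL(3,2),
  Size INTEGER,
  Accessibility ENUM( 'yes', 'no' ),
  IrrigationSystem ENUM( 'yes', 'no' ),
  PRIMARY KEY ( Name )
) ENGINE=InnoDB;

-- Dependent table: WhereStored has the same type as Storages.Name
CREATE TABLE Crops (
  ID INTEGER NOT NULL AUTO_INCREMENT,
  Type VARCHAR(30),
  Subtype VARCHAR(30),
  PlantType VARCHAR(10),
  HarvestMonth INTEGER,
  MinimumPH DECIMAL(3,2),
  MaximumPH DECIMAL(3,2),
  PlantsPerHectare INTEGER,
  ProductionPerPlant FLOAT,
  CostPerPlant DECIMAL(6,2),
  CropValuePerKg FLOAT,
  WhereStored VARCHAR(30),
  PRIMARY KEY ( ID ),
  FOREIGN KEY ( WhereStored ) REFERENCES Storages ( Name )
) ENGINE=InnoDB;

-- Junction table: Field and Crop have the types of the related primary keys
CREATE TABLE Plantings (
  ID INTEGER NOT NULL AUTO_INCREMENT,
  Field VARCHAR(30),
  Crop INTEGER,
  Number INTEGER,
  Percentage DECIMAL(5,2),
  PlantingDate DATE,
  MaturityDate DATE,
  RemovingDate DATE,
  PRIMARY KEY ( ID ),
  FOREIGN KEY ( Field ) REFERENCES Fields ( Name ),
  FOREIGN KEY ( Crop ) REFERENCES Crops ( ID )
) ENGINE=InnoDB;
```

The order of creation matters: a table with a FOREIGN KEY can be created only after the table it refers to.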

5.2.1. Field options


Once fields are defined we can also set all the options. For example, in the Plantings table:
• Field and Crop have the required (NOT NULL in MySQL) option;
• Number has validation rule >=0 and validation text "Number of planted plants must be zero or
  positive" (in MySQL condition CHECK ( Number >= 0 ));
• PlantingDate is set as required (NOT NULL in MySQL) and we also set the option "index, duplicates
  allowed" (set it as index in MySQL) since we are going to do searches based on the PlantingDate;
• MaturityDate and RemovingDate are not set to required since we want to leave the possibility to have
  these fields empty (for example, if we do not know when we will remove this planting);
• fields Town in table Fields, Type and Subtype in table Crops and Town in table Storages are set as
  "Index, Duplicates OK" (INDEX in MySQL), because we imagine they will be frequently used for sorting
  and filtering by the user. All primary keys are automatically indexes, without duplicates;
• Percentage has validation rule Between 0 And 1 and validation text "Percentage of used field must be
  between 0% and 100%" (in MySQL condition CHECK ( Percentage BETWEEN 0 AND 1 )). We can also set
  default value to 0 and option required (in MySQL DEFAULT 0 and NOT NULL), since it is important to
  always know how much field each planting uses.
Some fields clearly admit predetermined values. In Access a drop-down menu can be easily created using
the Lookup Wizard with values directly typed in. For example, to PlantType we add a drop-down menu
with a non-mandatory predetermined list containing the most common plant types (tree, bush, vine, herb);
we choose a non-mandatory list since other values can exist. Similarly, for HarvestMonth we insert
the twelve months in a mandatory predetermined list through the Lookup Wizard. In MySQL it is not possible
to insert a non-mandatory predetermined list, so only the HarvestMonth field can be arranged in this way,
using the ENUM type.

However, if these values are mandatory and we are sure that no other value is allowed, we should instead,
for any database management program, build another table with the predetermined list of values and
convert the considered field into a foreign key. For example, for HarvestMonth the list is mandatory since
only twelve months exist. Thus we build a Months table and we connect HarvestMonth to it, building
the relation and the drop-down menu using the Lookup Wizard displaying simply the Month, always
remembering to enforce referential integrity.

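[Diagram: table Crops (WhereStored, ID, Type, Subtype, PlantType, HarvestMonth) and table Months (Month);
foreign key HarvestMonth in Crops is related to Month in Months, with Months on the 1 side]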

5.2.2. Table validation rule
When applying table validation rules in Access or a condition in MySQL we must always keep in mind that a
rule can usually involve only fields which are required. If we involve fields which are not required, we must
always add to the validation rule the possibility that the non-required fields are empty, otherwise the
validation rule will automatically fail.
For example, a validation rule for table Plantings which requires that MaturityDate and RemovingDate are
not before PlantingDate, keeping in mind that MaturityDate and RemovingDate are not required, is for
Access
( [MaturityDate] Is Null Or [PlantingDate] <= [MaturityDate] ) And ( [RemovingDate] Is Null Or [PlantingDate] <= [RemovingDate] )
while for MySQL it is very similar
( MaturityDate IS NULL OR PlantingDate <= MaturityDate ) AND ( RemovingDate IS NULL OR PlantingDate <= RemovingDate )
Each part of the rule is true when the field is either empty or not before PlantingDate.
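The rule above can be exercised with a small sketch. It assumes Python's built-in sqlite3 module
as a stand-in for MySQL and stores dates as ISO text (YYYY-MM-DD), which is our assumption for this
illustration; the helper try_insert and the sample dates are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Plantings (
        ID INTEGER PRIMARY KEY,
        PlantingDate TEXT NOT NULL,  -- ISO dates (YYYY-MM-DD) compare correctly as text
        MaturityDate TEXT,           -- not required
        RemovingDate TEXT,           -- not required
        CHECK (
            (MaturityDate IS NULL OR PlantingDate <= MaturityDate)
            AND (RemovingDate IS NULL OR PlantingDate <= RemovingDate)
        )
    )
""")


def try_insert(planting, maturity, removing):
    """Hypothetical helper: True if the row satisfies the table rule, False otherwise."""
    try:
        conn.execute(
            "INSERT INTO Plantings (PlantingDate, MaturityDate, RemovingDate) "
            "VALUES (?, ?, ?)",
            (planting, maturity, removing),
        )
        return True
    except sqlite3.IntegrityError:
        return False


both_empty = try_insert("2014-03-01", None, None)
maturity_later = try_insert("2014-03-01", "2014-09-15", None)
maturity_earlier = try_insert("2014-03-01", "2014-02-01", None)
removing_earlier = try_insert("2014-03-01", "2014-09-15", "2013-12-31")
```

The first two rows are accepted, since empty dates are explicitly admitted; the last two are
refused because one date precedes PlantingDate.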

5.3. Inserting data


Only now can data be inserted, either manually or by importing them from tables in other formats, as
explained in section 2.2.3 for Access and in section 4.6 for MySQL.

6. Technical documentation (level 9)


At the end of every database project it is mandatory to write technical documentation. Building a database
without documentation means forcing other people to go through all the relations, tables, table
options, forms, queries, and reports to understand what they do and how they are built. This is especially
true when the schema is complicated, in particular to justify the presence or the absence of junction and
details tables.
Writing the technical documentation while the database is being built is a good way to produce complete
documentation without forgetting any step. While writing it, we should remember that it is addressed to
technicians, i.e. it is not necessary to explain how Access works or what the motivations of the database
and its cultural background are, and that the documentation can also be schematic.
Technical documentation starts with a general description of the database, of the data it contains and of
how they are handled. Then it contains an overview of the relationships, preferably with a picture of the
tables and relations, and a justification of the type of each relation, especially of the ones which are not
obvious. Then, going table by table, each field is presented with its type and all its options, giving also a
further explanation for the fields whose content is not obvious from the field's name. Next, queries are
presented, clearly stating their criteria, sorting, grouping, virtual fields, and also how they follow the
relations when retrieving data. This last point, frequently overlooked, is crucial since often two tables can
be connected following different paths of relations, with the query yielding different results. Finally we
present forms and reports, with all their features.

6.1. MyFarm example


The MyFarm database contains data about the crop organization of a small fruit farm. We handle data about
which crops are planted, the fields and their physical features, and where the fruits are stored once
removed from the plants. The database also contains historical records, since when a planting is removed it
is not deleted from the database but simply a RemovingDate is added.

6.1.1. Schema
The schema has the main tables Storages, which contains storage places data, Fields, which contains the
physical data of the fields, and Crops, which contains the biological and commercial features of the crops.
Moreover, a junction table Plantings is used to describe what is planted, where and when: this table
connects Crops and Fields via a many-to-many relation. The other relation is a one-to-many between
Storages and Crops, which underlines the fact that each crop can be put in only one storage.

[Figure: relationships between the tables; the WhereStored field of Crops links to Storages.]
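The keys behind this schema can be sketched as follows, reduced to the fields that build the
relations. The sketch assumes Python's built-in sqlite3 module as a stand-in for Access or MySQL,
and all sample rows (Barn, North, South) are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked
conn.executescript("""
    CREATE TABLE Storages (Name TEXT PRIMARY KEY);
    CREATE TABLE Fields (Name TEXT PRIMARY KEY);
    CREATE TABLE Crops (
        ID INTEGER PRIMARY KEY,
        Type TEXT,
        -- one-to-many: a single storage per crop, many crops per storage
        WhereStored TEXT REFERENCES Storages (Name)
    );
    -- junction table: each row links one crop to one field
    CREATE TABLE Plantings (
        ID INTEGER PRIMARY KEY,
        Crop INTEGER REFERENCES Crops (ID),
        Field TEXT REFERENCES Fields (Name)
    );

    -- invented sample data
    INSERT INTO Storages VALUES ('Barn');
    INSERT INTO Fields VALUES ('North'), ('South');
    INSERT INTO Crops VALUES (1, 'Apple', 'Barn'), (2, 'Pear', 'Barn');
    INSERT INTO Plantings (Crop, Field) VALUES (1, 'North'), (1, 'South'), (2, 'North');
""")

# Many-to-many seen from the crop side: one crop planted in several fields.
fields_of_apple = [row[0] for row in conn.execute(
    "SELECT Field FROM Plantings WHERE Crop = 1 ORDER BY Field"
)]

# Many-to-many seen from the field side: one field hosting several crops.
crops_in_north = [row[0] for row in conn.execute(
    "SELECT Crops.Type FROM Plantings "
    "INNER JOIN Crops ON Crops.ID = Plantings.Crop "
    "WHERE Plantings.Field = 'North' ORDER BY Crops.Type"
)]
```

Since WhereStored is a single field of Crops, a crop cannot point to two storages, while the
junction table lets the same crop appear in several fields and the same field host several crops.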

6.1.2. Tables and fields


Warning: field types are for Access. For MySQL they must be properly adapted.
Table Fields contains
Name: text, primary key;
Town, Street: text;
Slope, pH, Size: number;
Accessibility: yes/no. This indicates whether a road reaches the field;
IrrigationSystem: yes/no, required, default value no.
Table Storages contains
Name: text, primary key;
Town, Address: text.
Table Crops contains
ID: autonumber, primary key;
Type, HarvestMonth: text;
PlantType: text. This contains the plant type such as tree, bush or herb;
Subtype: text. This contains the specific type of fruit, for example Pear Kaiser, Apple Golden;
MinimumPH, MaximumPH, PlantsPerHectare, ProductionPerPlant: number;
CostPerPlant, CropValuePerKg: currency;
WhereStored: foreign key, takes values from Storages table.
Table Plantings contains
ID: autonumber, primary key;
Crop: foreign key, takes values from Crops table;
Field: foreign key, takes values from Fields table;
Number: number. This indicates how many plants of Crop type are planted in Field;
PlantingDate, MaturityDate, RemovingDate: date/time;
Percentage: number. This indicates which percentage of the field is used for this crop.

6.1.3. Tables and fields with options


Warning: field options are for Access. For MySQL they must be properly adapted.
Table Fields contains
Name: text limited to 50 characters, primary key, required, no zero length;
Town, Street: text limited to 50 characters;
Slope: number, integer with 0 decimal digits, default value 0;
MinimumPH and MaximumPH: number, single with 2 decimal digits, default value 5, must be between 0 and 10;
Size: number, single with 2 decimal digits, default value 0, must be positive;
Accessibility: yes/no. This field indicates whether a road reaches the field;
IrrigationSystem: yes/no, required, default value no.
Table Storages contains
Name: text limited to 50 characters, primary key, required, no zero length;
Town, Address: text limited to 50 characters.
Table Crops contains
ID: autonumber, primary key;
Type: text limited to 50 characters, required, no zero length text, indexed (with duplicates),
non-mandatory predetermined list with values Apple, Pear, Strawberry, Grapevine;
Subtype: text limited to 50 characters, required. This contains the specific type of fruit, for example
Pear Kaiser, Apple Golden;
PlantType: text limited to 50 characters, non-mandatory predetermined list with values Tree, Bush,
Vine, Herb;
HarvestMonth: text limited to 50 characters, mandatory predetermined list with the twelve
months as values; there is an extra table Months, containing the twelve months, for which this field
is a foreign key;
MinimumPH, MaximumPH: number, single with 2 decimal digits, required, default value 5, must be
between 0 and 10;
PlantsPerHectare: number, integer with 0 decimal digits, default value 0, must be positive;
ProductionPerPlant: number, single with 2 decimal digits, default value 0, must be non-negative;
CostPerPlant: currency, euro format, single with 2 decimal digits, default value 0;
CropValuePerKg: currency, euro format, single with 2 decimal digits, default value 0;
WhereStored: foreign key, takes values from Storages table.
Table Plantings contains
ID: autonumber, primary key;
Crop: foreign key, takes values from Crops table;
Field: foreign key, takes values from Fields table;
Number: number, integer with 0 decimal digits, default value 0, must be non-negative. This field
indicates how many plants of Crop type are planted in Field;
PlantingDate: date/time, format dd/mm/yyyy, required, indexed (duplicates allowed);
MaturityDate, RemovingDate: date/time, format dd/mm/yyyy;
Percentage: number, percentage format, single with 2 decimal digits, default value 0%, must be
between 0% and 100%. This indicates which percentage of the field is used for this crop.
Table Plantings also has a table validation rule that checks that MaturityDate and RemovingDate,
when they exist, are not before PlantingDate.

6.1.4. Forms
Form Insert/Modify Crops shows the crops with their storage, using the relation between Storages and
Crops. It is possible to modify them or to insert new ones, but deletion of crops is forbidden.

6.1.5. Queries
Query Type+Subtype+FieldWithoutIrrigationQuery shows the fields without an IrrigationSystem together with
the crop type and crop subtype that are and were planted in each field. This query retrieves the fields with a No
in the IrrigationSystem field and then, thanks to the junction table Plantings and following the two
relations, retrieves the crops' type and subtype.
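The path this query follows through the relations can be sketched in SQL. The sketch below assumes
Python's built-in sqlite3 module as a stand-in for Access or MySQL, stores yes/no as 1/0, and uses
invented sample rows; it is an illustration of the described query, not its actual definition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Fields (
        Name TEXT PRIMARY KEY,
        IrrigationSystem INTEGER NOT NULL DEFAULT 0  -- yes/no stored as 1/0 (assumption)
    );
    CREATE TABLE Crops (ID INTEGER PRIMARY KEY, Type TEXT, Subtype TEXT);
    CREATE TABLE Plantings (
        ID INTEGER PRIMARY KEY,
        Crop INTEGER REFERENCES Crops (ID),
        Field TEXT REFERENCES Fields (Name),
        RemovingDate TEXT
    );

    -- invented sample data
    INSERT INTO Fields VALUES ('Hill', 0), ('Valley', 1);
    INSERT INTO Crops VALUES (1, 'Apple', 'Golden'), (2, 'Pear', 'Kaiser');
    INSERT INTO Plantings (Crop, Field, RemovingDate) VALUES
        (1, 'Hill', NULL),           -- still planted
        (2, 'Hill', '2013-10-01'),   -- removed, kept as historical record
        (1, 'Valley', NULL);
""")

# Fields -> Plantings -> Crops: the two relations are followed through the junction table.
rows = conn.execute("""
    SELECT Fields.Name, Crops.Type, Crops.Subtype
    FROM Fields
    INNER JOIN Plantings ON Plantings.Field = Fields.Name
    INNER JOIN Crops ON Crops.ID = Plantings.Crop
    WHERE Fields.IrrigationSystem = 0
    ORDER BY Fields.Name, Crops.Type, Crops.Subtype
""").fetchall()
```

No condition is put on RemovingDate, so removed plantings are listed too, which matches the "are and
were planted" wording of the description.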

6.1.6. Reports
Report Crops by Harvest Month displays, grouped by HarvestMonth and then by crop type, the
crop subtype and the corresponding field and town where the crop is or was planted. It takes data from
tables Crops and Fields, connected through the junction table Plantings and the two relations.
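The two grouping levels of this report can be imitated outside Access. The following sketch assumes
Python with its built-in sqlite3 and itertools modules and invented sample rows; the dictionary
report is a hypothetical structure chosen only to show the nesting.

```python
import sqlite3
from itertools import groupby

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Fields (Name TEXT PRIMARY KEY, Town TEXT);
    CREATE TABLE Crops (ID INTEGER PRIMARY KEY, Type TEXT, Subtype TEXT, HarvestMonth TEXT);
    CREATE TABLE Plantings (ID INTEGER PRIMARY KEY, Crop INTEGER, Field TEXT);

    -- invented sample data
    INSERT INTO Fields VALUES ('Hill', 'Bolzano'), ('Valley', 'Merano');
    INSERT INTO Crops VALUES (1, 'Apple', 'Golden', 'September'),
                             (2, 'Pear', 'Kaiser', 'September'),
                             (3, 'Strawberry', 'Wild', 'June');
    INSERT INTO Plantings (Crop, Field) VALUES (1, 'Hill'), (2, 'Valley'), (3, 'Hill');
""")

# Rows must be sorted by the grouping keys before groupby is applied.
# Months are sorted alphabetically here, not in calendar order.
rows = conn.execute("""
    SELECT Crops.HarvestMonth, Crops.Type, Crops.Subtype, Fields.Name, Fields.Town
    FROM Crops
    INNER JOIN Plantings ON Plantings.Crop = Crops.ID
    INNER JOIN Fields ON Fields.Name = Plantings.Field
    ORDER BY Crops.HarvestMonth, Crops.Type, Crops.Subtype
""").fetchall()

# First level: HarvestMonth; second level: crop type; details: subtype, field, town.
report = {}
for month, month_rows in groupby(rows, key=lambda r: r[0]):
    report[month] = {}
    for crop_type, type_rows in groupby(month_rows, key=lambda r: r[1]):
        report[month][crop_type] = [r[2:] for r in type_rows]
```

With these rows, September contains the two groups Apple and Pear, each with its subtype, field and
town, while June contains only Strawberry.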
