
Q 1. Discuss the meaning of each of the following terms: a. data b. database c. database management system d. application program e. data independence f. views.

Soln. (a) data

For end users, this constitutes all the different values connected with the various objects/entities that are of concern to them. (b) database

A shared collection of logically related data (and a description of this data), designed to meet the information needs of an organization. (c) database management system

A software system that: enables users to define, create, and maintain the database and provides controlled access to this database. (d) application program

A computer program that interacts with the database by issuing an appropriate request (typically an SQL statement) to the DBMS. (e) data independence

This is essentially the separation of underlying file structures from the programs that operate on them, also called program-data independence. (f) views.

A virtual table that does not necessarily exist in the database but is generated by the DBMS from the underlying base tables whenever it is accessed. Views present only a subset of the database that is of particular interest to a user. Views can be customized, for

example, field names may change, and they also provide a level of security preventing users from seeing certain data.
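For instance, a view over a hypothetical employee table (the table and column names here are illustrative, not from the text) can rename fields and hide sensitive columns:

Sql> create view staff_vw (id, full_name) as
     select emp_id, emp_name
     from employee;   -- the salary column is deliberately omitted for security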
Q 2. Describe the five components of the DBMS environment and discuss how they relate to each other. Soln. (1) Hardware: The computer system(s) that the DBMS and the application programs run

on. This can range from a single PC, to a single mainframe, to a network of computers. (2) Software: The DBMS software and the application programs, together with the

operating system, including network software if the DBMS is being used over a network. (3) Data: The data acts as a bridge between the hardware and software components

and the human components. As we've already said, the database contains both the operational data and the meta-data (the data about data). (4) Procedures: The instructions and rules that govern the design and use of the database. This may include instructions on how to log on to the DBMS, how to make backup copies of the database, and how to handle hardware or software failures. (5) People: This includes the database designers, database administrators (DBAs),

application programmers, and the end-users.

Q 3: What is a DBMS? What are the advantages and disadvantages of this system? Soln. A DBMS is a collection of programs that enables users to create and maintain a database. The DBMS is hence a general-purpose software system that facilitates the processes of defining, constructing, manipulating, and sharing databases among various users and applications. Defining a database involves specifying the data types, structures, and constraints for the data to be stored in the database. Constructing the database is the process of storing the data itself on some storage medium that is controlled by the DBMS. Manipulating the database includes such functions as querying the database to retrieve specific data and generating reports from the data. Sharing the database allows multiple users and programs to access the database concurrently. Other important functions provided by the DBMS include protecting the database and maintaining it over a long period of time.

Advantages of DBMS:
- Nonredundant data and consistency
- Easy access to data
- Multiple users can use the data at the same time
- Higher security
- Integrity assurance
Disadvantages of DBMS: Conventional data processing systems are typically designed to run a number of well-defined, preplanned processes. Such systems are often "tuned" to run efficiently for the processes that they were designed for. Although conventional systems are usually fairly inflexible, in that new applications may be difficult to implement and/or expensive to run, they are usually very efficient for the applications they were designed for. The database approach, on the other hand, provides a flexible alternative where new applications can be developed relatively inexpensively. The flexible approach is not without its costs, and one of these costs is the additional cost of running applications that the conventional system was designed for: using standardised software is almost always less machine efficient than specialised software.
**************
Q 4: Why is a DBMS better than a file system?
Soln: Drawbacks of using file systems to store data are:
- Data redundancy and inconsistency: multiple file formats and duplication of information in different files.
- Difficulty in accessing data: a new program must be written to carry out each new task.
- Data isolation: multiple files and formats.
- Integrity problems: integrity constraints (e.g. account balance > 0) become buried in program code rather than being stated explicitly; it is hard to add new constraints or change existing ones.
- Atomicity of updates: failures may leave the database in an inconsistent state with partial updates carried out. Example: a transfer of funds from one account to another should either complete or not happen at all.
- Concurrent access by multiple users: concurrent access is needed for performance, but uncontrolled concurrent accesses can lead to inconsistencies. Example: two people reading a balance and updating it at the same time.
- Other security problems.
Database systems offer solutions to all the above problems, so a DBMS is much more efficient than the traditional file system.
****************
Q 5. What are the different database users and what are their functions?

Soln. In a database environment, the primary resource is the database, and the secondary resources are the DBMS and related software. In any organization many people use the same resources, so there is a need for a chief administrator to oversee and manage them.
Database Administrator (DBA): Administering these resources is the responsibility of the DBA. The DBA is responsible for authorizing access to the database, for coordinating its use, and for acquiring software and hardware resources as needed. The DBA coordinates all the activities of the database system and has a good understanding of the enterprise's information resources and needs. The DBA's duties include: schema definition; storage structure and access method definition; schema and physical organization modification; granting users authority to access the database; specifying integrity constraints; acting as liaison with users; and monitoring performance and responding to changes in requirements.
Database Designers: Database designers are responsible for identifying the data to be stored in the database and for choosing appropriate structures to represent and store this data. These tasks are mostly undertaken before the database is actually implemented and populated with data. It is the responsibility of database designers to communicate with database users in order to understand their requirements, and to prepare a design that meets those requirements. Normally the designers are on the staff of the DBA.
End Users: End users are the people who need access to the database for querying, updating, and generating reports; the database primarily exists for their use. There are several types of end users:
- Sophisticated users: form requests in a database query language.
- Specialized users: write specialized database applications that do not fit into the traditional data processing framework.
- Naive users: invoke one of the permanent application programs that have been written previously. Examples: people accessing the database over the web, bank tellers.
Q 6. Why are data models used in databases? Explain the categories of data models in brief.
Soln. A data model is a collection of concepts that can be used to describe the structure of a database. Data models can be broadly distinguished into three main categories:

1) High-level or conceptual data models (based on entities and relationships): provide concepts that are close to the way many users perceive data.
2) Low-level or physical data models: provide concepts that describe the details of how data is stored in the computer. These concepts are meant for computer specialists, not for typical end users.
3) Representational or implementation data models (record-based, object-oriented): provide concepts that can be understood by end users. These hide some details of data storage but can be implemented on a computer system directly.
Q 7. Explain the different data models in brief.
Soln. Hierarchical Model
The hierarchical data model organizes data in a tree structure. There is a hierarchy of parent and child data segments. This structure implies that a record can have repeating information, generally in the child data segments. Data is stored in a series of records, each of which has a set of field values attached to it. All the instances of a specific record are collected together as a record type. These record types are the equivalent of tables in the relational model, with the individual records being the equivalent of rows. To create links between these record types, the hierarchical model uses parent-child relationships. These are 1:N mappings between record types, implemented using trees (a structure, like the set theory used in the relational model, "borrowed" from mathematics). For example, an organization might store information about an employee, such as name, employee number, department, and salary. The organization might also store information about an employee's children, such as name and date of birth. The employee and children data form a hierarchy, where the employee data represents the parent segment and the children data represent the child segment. If an employee has three children, then there would be three child segments associated with one employee segment. In a hierarchical database the parent-child relationship is one-to-many. This restricts a child segment to having only one parent segment. Hierarchical DBMSs were popular from the late 1960s, with the introduction of IBM's Information Management System (IMS) DBMS, through the 1970s.
Network Model
The popularity of the network data model coincided with the popularity of the hierarchical data model. Some data were more naturally modeled with more than one parent per child. So, the network model permitted the modeling of many-to-many relationships in data. In 1971, the Conference on Data Systems Languages (CODASYL) formally defined the network model. The basic data modeling construct in the network model is the set construct. A set consists of an owner record type, a set name, and a member record type. A member record type can have that role in more than one set, hence the multiparent concept is supported. An owner record type can also be a member or owner in another set. The data model is a simple network, and link and intersection record types (called junction records by IDMS) may exist, as well as sets between them. Thus, the complete network of relationships is represented by several pairwise sets; in each set one record type is the owner (at the tail of the network arrow) and one or

more record types are members (at the head of the relationship arrow). Usually, a set defines a 1:M relationship, although 1:1 is permitted. The CODASYL network model is based on mathematical set theory.
Relational Model (RDBMS - relational database management system)
A database based on the relational model developed by E. F. Codd. A relational database allows the definition of data structures, storage and retrieval operations, and integrity constraints. In such a database the data and the relations between them are organised in tables. A table is a collection of records, and each record in a table contains the same fields.
Properties of relational tables:
- Values are atomic
- Each row is unique
- Column values are of the same kind
- The sequence of columns is insignificant
- The sequence of rows is insignificant
- Each column has a unique name
Certain fields may be designated as keys, which means that searches for specific values of that field will use indexing to speed them up. Where fields in two different tables take values from the same set, a join operation can be performed to select related records in the two tables by matching values in those fields. Often, but not always, the fields will have the same name in both tables. For example, an "orders" table might contain (customer-ID, product-code) pairs and a "products" table might contain (product-code, price) pairs, so to calculate a given customer's bill you would sum the prices of all products ordered by that customer by joining on the product-code fields of the two tables. This can be extended to joining multiple tables on multiple fields. Because these relationships are only specified at retrieval time, relational databases are classed as dynamic database management systems. The relational database model is based on the relational algebra.
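The customer-bill calculation just described maps directly onto a SQL join; a minimal sketch, assuming illustrative orders and products tables and a hypothetical customer ID:

-- sum the prices of all products ordered by one customer,
-- joining the two tables on their shared product_code field
select o.customer_id, sum(p.price) as bill
from orders o
join products p on p.product_code = o.product_code
where o.customer_id = 'C042'
group by o.customer_id;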

Object-Oriented Model
Object DBMSs add database functionality to object programming languages. They bring much more than persistent storage of programming language objects. Object DBMSs extend the semantics of the C++, Smalltalk and Java object programming languages to provide full-featured database programming capability, while retaining native language compatibility. A major benefit of this approach is the unification of application and database development into a seamless data model and language environment. As a result, applications require less code, use more natural data modeling, and code bases are easier to maintain. Object developers can write complete database applications with a modest amount of additional effort. According to Rao (1994), "The object-oriented database (OODB) paradigm is the combination of object-oriented programming language (OOPL) systems and persistent systems. The power of the OODB comes from the seamless treatment of both persistent data, as found in databases, and transient data, as found in executing programs."
In contrast to a relational DBMS, where a complex data structure must be flattened out to fit into tables or joined together from those tables to form the in-memory structure, object DBMSs have no performance overhead to store or retrieve a web or hierarchy of interrelated objects. This one-to-one mapping of object programming language objects to database objects has two benefits over other storage approaches: it provides higher-performance management of objects, and it enables better management of the complex interrelationships between objects. This makes object DBMSs better suited to support applications such as financial portfolio risk analysis systems, telecommunications service applications, world wide web document structures, design and manufacturing systems, and hospital patient record systems, which have complex relationships between data.
Q 8: Explain the following. 1. Generalization & specialization 2. Aggregation 3. Attributes

4. Total & Partial constraints 5. Recursive Relationship 6. Cardinality ratio
Soln.

1. Generalization & specialization
Specialization: The process of designating subgroupings within an entity set is called specialization, as an entity set may include subgroupings of entities that are distinct in some way from other entities in the set. For example, consider an entity set person, with attributes person_id, name, street, city. A person may be further classified as one of the following: (i) employee (ii) customer. Each of these person types is described by a set of attributes that includes all the attributes of the entity set person plus possibly additional attributes. For example, a customer entity may be described further by the attribute credit_rating, whereas employee entities may be described by the attribute salary. The specialization of person allows us to differentiate among persons according to whether they are employees or customers.

In terms of ER diagrams, specialization is depicted by a triangle component labeled ISA.
Generalization: There are similarities between the customer entity set and the employee entity set in the sense that they have several attributes that are conceptually the same across the two entity sets: normally, the identifier, name, street, and city attributes. These commonalities can be expressed as generalization, which is a containment relationship that exists between a higher-level entity set and one or more lower-level entity sets. Generalization is used to emphasize the similarities among lower-level entity sets and to hide the differences (a relational sketch of this person/employee/customer hierarchy appears after the next paragraph).
2. Aggregation
One limitation of the ER model is that it cannot express relationships among relationships. The best way to model such a situation is to use aggregation. Aggregation is an abstraction through which relationships are treated as higher-level entities.
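As a concrete sketch of the person/employee/customer ISA hierarchy above (one common table-per-subclass mapping; the column types are assumptions, not from the text):

create table person (
    person_id int primary key,
    name      varchar(50),
    street    varchar(50),
    city      varchar(30)
);
-- each subclass table shares the superclass key, realizing the ISA link
create table employee (
    person_id int primary key references person(person_id),
    salary    decimal(10,2)
);
create table customer (
    person_id int primary key references person(person_id),
    credit_rating int
);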

3. Total & Partial constraints

If every entity in an entity set participates in at least one relationship in a relationship set, this is known as total participation. Example: employee entity set, salary-paid relationship set, salary-code entity set (every employee is paid a salary). If only some of the entities participate in relationships in the relationship set, this is known as partial participation. Example: employee entity set, on-leave relationship set, dates entity set (only some employees are on leave). (The ER model figure showing total and partial constraints is not reproduced here.)

4. Recursive Relationship
Each entity type that participates in a relationship type plays a particular role in the relationship. The role name signifies the role that a participating entity from the entity type plays in each relationship instance, and helps to explain what the relationship means. For example, in the works_for relationship type, Employee plays the role of employee or worker and Department plays the role of department or employer. Role names are not technically necessary in relationship types where all the participating entity types are distinct, since each participating entity type name can be used as the role name. However, in some cases the same entity type participates more than once in a relationship type, in different roles. In such cases the role name becomes essential for distinguishing the meaning of each participation. Such relationship types are called recursive relationships.

[Figure omitted: an instance diagram of the SUPERVISION relationship set, showing employee entities E1-E6 linked through relationship instances r1-r6, each employee appearing in role 1 (supervisor) or role 2 (subordinate).]

Fig: A recursive relationship SUPERVISION between Employee in the supervisor role (1) and Employee in the subordinate role (2). The SUPERVISION relationship type relates an employee to a supervisor, where both the employee and the supervisor entities are members of the same Employee entity type. Hence the Employee entity type participates twice in SUPERVISION: once in the role of supervisor and once in the role of supervisee.
5. Attributes
Each entity has attributes, i.e. particular properties that describe the entity. For example, a student has properties of his own: a student identification number, a name, and a grade. A particular value of an attribute, such as 93 for the grade, is a value of the attribute. Most of the data in a database consists of values of attributes. The set of all possible values of an attribute, such as integers from 0 to 100 for a grade, is the attribute domain. In an ER model, an attribute name appears in an oval that has a line to the corresponding entity box. Also we can say that an entity is represented by a set of attributes, that is, descriptive properties possessed by all members of an entity set. Examples:
customer = (customer-name, social-security, customer-street, customer-city)
account = (account-number, balance)
The various types of attributes are:
- Simple and composite attributes: Attributes which cannot be divided into subparts are called simple attributes; attributes which can be further divided into subparts are known as composite attributes. For example, a name attribute can be divided into first name, middle name, and last name, and in the same manner an address can be expressed as street, city, pin code, etc.
- Stored and derived attributes: Attributes whose values are derived from the values of other related attributes are called derived attributes. For example, if a customer entity has an attribute age and the entity set also has an attribute DOB, we can calculate age from DOB and the current date. Thus age is a derived attribute and DOB is a stored attribute.
- Single and multiple valued attributes: An attribute which has a single value for a particular entity is known as a single valued attribute; for example, the SSN of an employee cannot have more than one value. Attributes which can have more than one value for a particular entity are known as multiple valued attributes; for example, the dependent names of an employee, the college degrees of a student, or phone numbers.
6. Cardinality ratio
The cardinality of a relation is the number of tuples it contains. It changes as tuples are added or deleted; the cardinality is a property of the extension of the relation and is determined from the instance of the relation at any given moment. (For a relationship type in the ER model, the cardinality ratio specifies the maximum number of relationship instances an entity can participate in; the common ratios for binary relationships are 1:1, 1:N, and M:N.)
***************
Q 9: Draw the ER-diagrams of a Banking System and a Hospital System.
Soln. (i) ER-Diagram for Banking system (the diagram itself is not reproduced here).

(ii) ER-Diagram for Hospital Management
Entity and relationship sets: Patients, Doctors, Beds, Examines, BedAssigned, Accounts, HasAccount.
- Patients: entity set with attributes SSNo, LastName, FirstName, HomePhone, Sex, DateofBirth, Age, Street, City, State, Zip.
- Doctors: entity set with attributes SSNo, LastName, FirstName, OfficePhone, Pager, Specialty.
- Examines: relationship set with attributes Date, Time, Diagnosis, Fee.
- Beds: entity set with attributes RoomNumber, BedNumber, Type, Status, PricePerHour.
- BedAssigned: relationship set with attributes DateIn, TimeIn, DateOut, TimeOut, Amount.
- Accounts: weak entity set with attributes DateIn, DateOut, Amount.
- HasAccount: relationship set with no attributes.
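A fragment of this design expressed in SQL (a sketch only; the column types are assumptions) shows how a relationship set such as BedAssigned borrows the keys of the entity sets it connects and carries its own attributes:

create table patients (
    ssno       char(9) primary key,
    last_name  varchar(30),
    first_name varchar(30)
);
create table beds (
    room_number    int,
    bed_number     int,
    price_per_hour decimal(8,2),
    primary key (room_number, bed_number)
);
-- relationship set: keys of both participants plus its own attributes
create table bed_assigned (
    ssno        char(9) references patients(ssno),
    room_number int,
    bed_number  int,
    date_in     date,
    date_out    date,
    amount      decimal(10,2),
    primary key (ssno, room_number, bed_number, date_in),
    foreign key (room_number, bed_number) references beds(room_number, bed_number)
);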

***************
Q 10: Define the various components of SQL; define keys in SQL. OR Explain all the SQL statements in brief.
Soln. SQL is made up of 4 components:
- DDL (Data Definition Language): CREATE, ALTER, DROP
- DML (Data Manipulation Language): SELECT, INSERT, UPDATE, DELETE
- DCL (Data Control Language): GRANT, REVOKE
- TCL (Transaction Control Language): COMMIT, ROLLBACK

DDL statement syntax:
Syntax for CREATE:
Sql> create table <table_name> (column1 datatype1, column2 datatype2, ..., columnN datatypeN);
Syntax for ALTER:
Sql> alter table <table_name> [add/modify] (column datatype);
Syntax for DROP:
Sql> drop [table/index/synonym/view] <object_name>;
DML statement syntax:
Syntax for SELECT:
Sql> select [all / * / distinct / column_list] from <table_name> [where condition];
Syntax for INSERT:
Sql> insert into <table_name> (column_list) values (value_list);
Syntax for UPDATE:
Sql> update <table_name> set <column = value> [where condition];
DCL statement syntax:
Syntax for GRANT:
Sql> grant <privilege_name> on <object_name> to <user_name> [with grant option];
Syntax for REVOKE:
Sql> revoke <privilege_name> on <object_name> from <user_name>;
TCL statement syntax:
Syntax for COMMIT:
Sql> commit;
COMMIT is used for permanent saving of changes to the database.
Syntax for ROLLBACK:
Sql> rollback;
If changes have been applied but not yet committed, ROLLBACK undoes them.
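A short worked session tying these statements together (the student table, its columns, and the clerk user are hypothetical, used only for illustration):

Sql> create table student (roll_no number primary key, name varchar2(30), marks number);
Sql> insert into student (roll_no, name, marks) values (1, 'Asha', 82);
Sql> update student set marks = 85 where roll_no = 1;
Sql> select * from student where marks > 80;
Sql> grant select on student to clerk;
Sql> commit;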

Q 11. Differentiate among the primary key, foreign key, and super key.
Soln. The relational model separates the conceptual and physical views of data in an effective way, so that correctness is enforced simply by controlling values, rather than pointers or other physical structures. Uniqueness and integrity are enforced by means of keys, i.e. each tuple in a relation is identified uniquely by the values of its attributes.
A superkey is an attribute or a set of attributes that uniquely identifies a tuple within a relation. A superkey may contain additional attributes that are not necessary for unique identification.
A candidate key is a superkey such that no proper subset of it is a superkey within the relation. A candidate key K for a relation R has two properties:
- Uniqueness: in each tuple of R, the values of K uniquely identify that tuple.
- Irreducibility: no proper subset of K has the uniqueness property.
There may be several candidate keys for a relation. When a key consists of more than one attribute, we call it a composite key.
The primary key is the candidate key that is selected to identify tuples uniquely within the relation. Since a relation has no duplicate tuples, it is always possible to choose a primary key. The candidate keys that are not selected to be the primary key are called alternate keys.
A foreign key is an attribute or set of attributes within one relation that matches the candidate key of some (possibly the same) relation. When an attribute appears in more than one relation, its appearance usually represents a relationship between tuples of the two relations. A self-relationship within one relation is possible. We say the foreign key in a relation targets the primary key attribute in the home relation.
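These key definitions translate directly into SQL constraints; a sketch with illustrative tables (the names and types are assumptions):

Sql> create table department (
       dept_id   number primary key,   -- the chosen candidate key
       dept_name varchar2(30) unique   -- an alternate candidate key
     );
Sql> create table employee (
       emp_id  number primary key,
       dept_id number references department(dept_id)  -- foreign key targeting department's primary key
     );
****************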

Q 11: How is relational algebra different from relational calculus? How are they similar?
Soln. Differences between relational calculus and relational algebra:
1. In relational calculus, one declarative expression is written to specify a retrieval request. In relational algebra, a sequence of operations is written (though these operations can be nested to form a single expression).
2. Calculus is declarative and non-procedural. In algebra, a certain order among the operations is always explicitly specified.
3. A calculus expression gives no description of how to evaluate a query; it specifies what is to be retrieved rather than how to retrieve it. Any retrieval that can be specified in the relational algebra can also be specified in the calculus. Experiments conducted by Welty and Stemple (1981) and Hanson and Hanson (1987, 88) indicated that users prefer a procedural language to solve problems, as algebra provides a collection of explicit operations (join, union, projection, etc.) that can be used to tell the system how to actually build some desired relation from the given relations in the database.
4. The calculus formulation is descriptive: it describes what the problem is. The algebra formulation is prescriptive: it prescribes a procedure for solving that problem.
5. Calculus is closer to a natural language; algebra is more like a programming language.
Similarities of relational algebra and relational calculus: both formal languages share the same limitations. Neither can express:
- aggregate operations,
- recursive queries,
- complex (non-tabular) structures.
Also, most of these are expressible in SQL, OQL, or XQuery using other special operators, and sometimes we even need the power of a Turing-complete programming language.
*************
Q 12: Define the following: 1. Existential quantifier 2. Universal quantifier 3. Range relation 4. Safe expression
Soln. Existential quantifier: The quantifier (∃) is called an existential quantifier because a formula (∃t)(F) is true if there exists some tuple that makes F true. For example:
1. Retrieve the name and address of all employees who work for the Research department:
{t.fname, t.lname, t.address | Employee(t) and (∃d)(Department(d) and d.dname='Research' and d.dnumber=t.dno)}
2. For every project located in Chennai, list the project number, the controlling department number, and the department manager's last name, DOB, and address:

{p.pnumber, p.dnum, m.lname, m.dob, m.address | Project(p) and Employee(m) and p.plocation='CHENNAI' and (∃d)(Department(d) and p.dnum=d.dnumber and d.MgrEno=m.Eno)}
Universal quantifier: The quantifier (∀) is called a universal quantifier because a formula (∀t)(F) is true if, for every possible tuple that can be assigned to the free occurrences of t in F, F is true under that substitution. For example:
1. Find the names of the employees who work on all the projects controlled by department number 5:
{e.lname, e.fname | Employee(e) and (∀x)(NOT(Project(x)) or NOT(x.dno=5) or (∃w)(Works_on(w) and w.essn=e.ssn and x.pnumber=w.pno))}
2. Find the names of employees who have no dependent:
{e.fname, e.lname | Employee(e) and (NOT(∃d)(Dependent(d) and e.ssn=d.essn))}
Range relation: The tuple relational calculus is based on specifying a number of tuple variables. Each tuple variable usually ranges over a particular database relation, meaning that the variable may take as its value any individual tuple from that relation. A simple tuple relational calculus query is of the form
{t | COND(t)}
where t is a tuple variable and COND(t) is a conditional expression involving t. The result of such a query is the set of all tuples t that satisfy COND(t). For example, to find all employees whose salary is above Rs. 50,000, we can write the following tuple calculus expression:
{t | Employee(t) and t.salary > 50000}
The condition Employee(t) specifies that the range relation of tuple variable t is Employee. Each employee tuple t that satisfies the condition t.salary > 50000 will be retrieved.
Safe expression: Whenever we use universal quantifiers, existential quantifiers, or negation of predicates in a calculus expression, we must make sure that the resulting expression makes sense. A safe expression in relational calculus is one that is guaranteed to yield a finite number of tuples as its result; otherwise it is said to be unsafe. For example, the expression
{t | NOT(Employee(t))}
is unsafe because it yields all tuples in the universe that are not Employee tuples, which are infinitely numerous.
***************
Q 13: Consider the following relations, where the primary keys are underlined:
employee(person_name, street, city)
works(person_name, company_name, salary)
company(company_name, city)
manages(person_name, manager_name)
Write an expression in relational algebra for each of the following queries:
(a) Find the names of all employees who live in the same city and on the same street as their manager.
(b) Find the names of all employees who earn more than every employee of Small Bank Corporation.
(c) Modify the database so that Jones now lives in Newtown.
Soln.

(a) Π_person_name ((employee ⋈ manages) ⋈_(manager_name = employee2.person_name ∧ employee.street = employee2.street ∧ employee.city = employee2.city) ρ_employee2(employee))

(b) Π_person_name(works) − Π_works.person_name (works ⋈_(works.salary ≤ works2.salary ∧ works2.company_name = 'Small Bank Corporation') ρ_works2(works))

(c) employee ← Π_(person_name, street, 'Newtown') (σ_(person_name='Jones')(employee)) ∪ (employee − σ_(person_name='Jones')(employee))
*******************
Q 14: Consider the following database schema and write down the SQL queries for the statements given below.
account(account_number, branch_name, balance)
branch(branch_name, branch_city, assets)
customer(customer_name, customer_street, customer_city)
loan(loan_number, branch_name, amount)
depositor(customer_name, account_number)
borrower(customer_name, loan_number)
1. Set of names of customers with accounts at a branch where Hayes has an account.
2. Set of names of branches whose assets are greater than the assets of some branch in Brooklyn.
3. Names of customers with both accounts and loans at the Perryridge branch.
4. Names of customers with an account but not a loan at the Perryridge branch.
5. Set of names of customers at the Perryridge branch, in alphabetical order.
Soln.
1. select distinct D.customer_name
   from depositor D, account A
   where D.account_number = A.account_number
   and A.branch_name in (select Ah.branch_name
                         from depositor Dh, account Ah
                         where Dh.account_number = Ah.account_number
                         and Dh.customer_name = 'Hayes');
2. select distinct T.branch_name
   from branch T, branch S
   where T.assets > S.assets and S.branch_city = 'Brooklyn';
3. select customer_name
   from customer
   where exists (select *
                 from account, depositor
                 where account.account_number = depositor.account_number
                 and depositor.customer_name = customer.customer_name
                 and

                 branch_name = 'Perryridge')
   and exists (select *
               from loan, borrower
               where loan.loan_number = borrower.loan_number
               and borrower.customer_name = customer.customer_name
               and branch_name = 'Perryridge');
4. select customer_name
   from customer
   where exists (select *
                 from account, depositor
                 where account.account_number = depositor.account_number
                 and depositor.customer_name = customer.customer_name
                 and branch_name = 'Perryridge')
   and not exists (select *
                   from loan, borrower
                   where loan.loan_number = borrower.loan_number
                   and borrower.customer_name = customer.customer_name
                   and branch_name = 'Perryridge');
5. select distinct borrower.customer_name
   from borrower, loan
   where borrower.loan_number = loan.loan_number
   and loan.branch_name = 'Perryridge'
   order by borrower.customer_name;
**********

Q 15: What is a view? What are the advantages of using views?
Soln. To reduce redundant data to the minimum possible, Oracle allows the creation of an object called a view. A view is mapped to a SELECT sentence: the table on which the view is based is described in the FROM clause of the SELECT statement. The reasons to create a view:
- When data security is required.
- When data redundancy is to be kept to the minimum while maintaining data security.
Logical data is how we want to see the current data in our database; physical data is how this data is actually placed in our database. Views are masks placed upon tables. This allows the programmer to develop a method via which we can display predetermined data to users according to our desire. Views may be created for the following reasons:
- The DBA stores a view as a definition only, hence there is no duplication of data.
- Simplifies queries.
- Can be queried like a base table itself.

- Provides data security.
- Avoids data redundancy.
Creation of views:
Syntax:
CREATE VIEW viewname AS SELECT colname1, colname2, ... FROM tablename WHERE columnname = expression_list;
Renaming the columns of a view:
Syntax:
CREATE VIEW viewname (newcolumnname1, newcolumnname2, ...) AS SELECT colname1, colname2, ... FROM tablename WHERE columnname = expression_list;
Selecting a data set from a view:
Syntax:
SELECT columnname, columnname FROM viewname WHERE search_condition;
Destroying a view:
Syntax:
DROP VIEW viewname;
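For example (an illustrative sketch; the employee table and its columns are assumed, not given in the text):

CREATE VIEW emp_vw AS
    SELECT emp_id, emp_name, dept_no
    FROM employee
    WHERE dept_no = 10;     -- expose only one department's rows

SELECT emp_name FROM emp_vw;

DROP VIEW emp_vw;
************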

Q 16: Explain triggers and assertions with suitable examples.
Soln. Database triggers: Database triggers are procedures that are stored in the database and are implicitly executed (fired) when the contents of a table are changed.
Use of database triggers: Database triggers allow Oracle to provide a highly customized database management system. Some of the uses to which database triggers can be put are as follows:
- A trigger can permit DML statements against a table only if they are issued during regular business hours or on predetermined weekdays.
- A trigger can be used to keep an audit trail of a table, along with the operation performed and the time at which the operation was performed.
- It can be used to prevent invalid transactions.
- It can enforce complex security authorizations.
How to apply database triggers: A trigger has three basic parts:
- A triggering event or statement.
- A trigger restriction.
- A trigger action.
Types of triggers: Using the various options, four types of triggers can be created:
- Before statement trigger: Before executing the triggering statement, the trigger action is executed.
- Before row trigger: Before modifying each row affected by the triggering statement and before checking appropriate integrity constraints, the trigger action is executed, if the trigger restriction either evaluated to TRUE or was not included.
- After statement trigger: After executing the triggering statement and applying any deferred integrity constraints, the trigger action is executed.
- After row trigger: After modifying each row affected by the triggering statement and possibly applying appropriate integrity constraints, the trigger action is executed for the current row, if the trigger restriction either evaluates to TRUE or was not included.
Syntax for creating a trigger:
Create or replace Trigger <triggername>
{Before, After} {Delete, Insert, Update} On <tablename>
For Each Row When <condition>
Declare
    <variable declarations>;
    <constant declarations>;
Begin
    <PL/SQL subprogram body>;
Exception
    <exception PL/SQL block>;
End;
How to delete a trigger: The syntax for deleting a trigger is as follows:
Drop Trigger <triggername>;
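To make the syntax concrete, here is a sketch of an after-row audit trigger in the Oracle style described above (the employee and emp_audit tables are assumptions, not from the text):

Create or replace Trigger emp_audit_trg
After Update On employee
For Each Row
Begin
    -- keep an audit trail: record the old and new values and when they changed
    insert into emp_audit (emp_id, old_salary, new_salary, changed_on)
    values (:old.emp_id, :old.salary, :new.salary, sysdate);
End;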

ASSERTIONS
An assertion is a predicate expressing a condition that we want the database always to satisfy. Assertions are specific to the SQL standard.
Syntax: create assertion <name> check (<predicate>);
When an assertion is specified, the DBMS tests it for validity. This testing may introduce a significant amount of computing overhead (query evaluation), thus assertions should be used carefully. Note that assertions are not offered in Oracle SQL.
Example: For each product, there must be at least two suppliers:
create assertion two_suppliers check
    (not exists (select * from offers O1
                 where not exists (select * from offers
                                   where O1.SName <> SName
                                   and O1.Prodname = Prodname)));
********************
Q 17: Consider the cust_banker_branch relation and the FD given below. Prove that the given relation schema is in 3NF.
cust_banker_branch = (customer_id, employee_id, branch_name, type)
employee_id → branch_name
Soln: A relation schema R is in 3NF with respect to a set F of FDs if, for all dependencies in the closure F+ of the form X → Y, where X ⊆ R and Y ⊆ R, at least one of the following holds:

1. X → Y is a trivial FD.
2. X is a superkey for R.
3. Each attribute A in Y − X is contained in a candidate key for R.
Note that the third condition does not say that a single candidate key must contain all the attributes in Y − X; each attribute A in Y − X may be contained in a different candidate key.
Let us now consider the cust_banker_branch relation to determine whether it is in 3NF. Here X = employee_id, Y = branch_name, and Y − X = branch_name. It turns out that branch_name is contained in a candidate key and that, therefore, cust_banker_branch is in 3NF. Let us establish this fact. Given the FDs
employee_id → branch_name
customer_id, branch_name → employee_id
the FD
customer_id, employee_id → cust_banker_branch
holds as a result of (customer_id, employee_id) being the primary key. This makes (customer_id, employee_id) a candidate key. Of course, it does not contain branch_name, so we need to see if there is another candidate key. As it turns out, the set of attributes (customer_id, branch_name) is a candidate key. Let us see why this is the case. Given a particular customer_id value and branch_name value, we know there is only one associated employee_id value, because
customer_id, branch_name → employee_id.
But then, for that particular customer_id value and employee_id value, there can be only one associated cust_banker_branch tuple.
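The two candidate keys found above can be recorded directly in SQL, one as the primary key and the other as a unique constraint (a sketch; the column types are assumptions):

create table cust_banker_branch (
    customer_id int,
    employee_id int,
    branch_name varchar(30),
    type        varchar(10),
    primary key (customer_id, employee_id),   -- candidate key 1
    unique (customer_id, branch_name)         -- candidate key 2, which contains branch_name
);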

Q 18. What is normalization? Explain the basic concepts of normalization. OR What do you mean by normalization? Why is it important for a database to be in normalized form?
Soln: Normalization is a design technique that is widely used as a guide in designing relational databases. Normalization is essentially a two-step process that puts data into tabular form by removing repeating groups and then removes duplicated data from the relational tables. Normalization theory is based on the concept of normal forms. A relational table is said to be in a particular normal form if it satisfies a certain set of constraints. There are currently five normal forms that have been defined. In this section, we will cover the first three normal forms, which were defined by E. F. Codd.
Basic Concepts
The goal of normalization is to create a set of relational tables that are free of redundant data and that can be consistently and correctly modified. This means that all tables in a relational database should be in the third normal form (3NF). A relational table is in 3NF if and only if all non-key columns are (a) mutually independent and (b) fully dependent upon the primary key. Mutual independence means that no non-key column is dependent upon any combination of the other columns. The first two normal forms are intermediate steps toward the goal of having all tables in 3NF. In order to better understand 2NF and the higher forms, it is necessary to understand the concepts of functional dependencies and lossless decomposition.
****************
Q 19. What is functional dependency? Explain the closure of a set of functional dependencies.
Soln. The concept of functional dependencies is the basis for the first three normal forms. A column Y of the relational table R is said to be functionally dependent upon column X of R if and only if each value of X in R is associated with precisely one value of Y at any given time. X and Y may be composite. Saying that column Y is functionally dependent upon X is the same as saying that the values of column X identify the values of column Y. If column X is a primary key, then all columns in the relational table R must be functionally dependent upon X. A shorthand notation for describing a functional dependency is
R.x → R.y
which can be read as: in the relational table named R, column x functionally determines (identifies) column y.
Full functional dependence applies to tables with composite keys. Column Y in relational table R is fully functionally dependent on X of R if it is functionally dependent on X and not functionally dependent upon any proper subset of X. Full functional dependence means that

when a primary key is composite, made of two or more columns, the other columns must be identified by the entire key and not just some of the columns that make up the key.
Closure of a Set of Functional Dependencies
We need to consider all functional dependencies that hold. Given a set F of functional dependencies, we can prove that certain other ones also hold; we say these ones are logically implied by F. Suppose we are given a relation scheme R = (A, B, C, G, H, I) and the set of functional dependencies:
A → B
A → C
CG → H
CG → I
B → H
Then the functional dependency A → H is logically implied. To see why, let t1 and t2 be tuples such that
t1[A] = t2[A].
As we are given A → B, it follows that we must also have
t1[B] = t2[B].
Further, since we also have B → H, we must also have
t1[H] = t2[H].
Thus, whenever two tuples have the same value on A, they must also have the same value on H, and we can say that A → H.
Closure of Attribute Sets
To test whether a set of attributes X is a superkey, we need to find the set of attributes functionally determined by X. Let X be a set of attributes. We call the set of attributes determined by X under a set F of functional dependencies the closure of X under F, denoted X+. The following algorithm computes X+:
result := X;
while (changes to result) do
    for each functional dependency Y → Z in F do
    begin
        if Y ⊆ result
        then result := result ∪ Z;
    end
****************

Q 20. What do you mean by Armstrong's axioms for finding FDs? OR What is functional dependency? Prove the inference rules of FDs.
Soln. A functional dependency (FD) X → Y is inferred from a set of dependencies F specified on R if X → Y holds in every legal relation state of R, that is, in every state r that satisfies all the dependencies in F. To infer dependencies systematically, we use a set of inference rules that derive new dependencies from a given set of dependencies. The first three rules, IR1 to IR3, are known as Armstrong's axioms:
- Reflexivity rule: if X is a set of attributes and Y ⊆ X, then X → Y holds.
- Augmentation rule: if X → Y holds and Z is a set of attributes, then XZ → YZ holds.
- Transitivity rule: if X → Y holds and Y → Z holds, then X → Z holds.
These rules are sound because they do not generate any incorrect functional dependencies; they are also complete, as they generate all of F+. To make life easier we can use some additional rules, derivable from Armstrong's axioms:
- Union rule: if X → Y and X → Z, then X → YZ holds.
- Decomposition rule: if X → YZ holds, then X → Y and X → Z both hold.
- Pseudotransitivity rule: if X → Y holds and ZY → A holds, then XZ → A holds.
Proofs of Armstrong's axioms:
IR1 (reflexive rule): If Y ⊆ X, then X → Y.
PROOF OF IR1: Suppose that Y ⊆ X and that two tuples t1 and t2 exist in some relation instance r of R such that t1[X] = t2[X]. Then t1[Y] = t2[Y] because Y ⊆ X; hence, X → Y must hold in r.
IR2 (augmentation rule): {X → Y} ⊨ XZ → YZ.
PROOF OF IR2: Assume that X → Y holds in a relation r of R but that XZ → YZ does not hold. Then there must exist two tuples t1 and t2 in r such that
(1) t1[X] = t2[X]
(2) t1[Y] = t2[Y]
(3) t1[XZ] = t2[XZ]
(4) t1[YZ] ≠ t2[YZ]
This is not possible, because from (1) and (3) we deduce (5) t1[Z] = t2[Z], and from (2) and (5) we deduce (6) t1[YZ] = t2[YZ], contradicting (4).
IR3 (transitive rule): {X → Y, Y → Z} ⊨ X → Z.
PROOF OF IR3: Assume that (1) X → Y and (2) Y → Z both hold in relation r. Then for any two tuples t1 and t2 in r such that t1[X] = t2[X], we must have (3) t1[Y] = t2[Y], from assumption (1); hence we must also have (4) t1[Z] = t2[Z], from (3) and assumption (2); hence X → Z must hold in r.
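As a worked illustration of how the derived rules follow from IR1-IR3, here is a standard derivation of the union rule (not taken from the text):

Given: X → Y and X → Z. Show: X → YZ.
1. X → Y                (given)
2. X → XY               (augment 1 with X; note XX = X)
3. X → Z                (given)
4. XY → YZ              (augment 3 with Y)
5. X → YZ               (transitivity on 2 and 4)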

Q 21. Explain 3NF and BCNF. How is BCNF more desirable than 3NF? Explain with an example.
OR Prove with a suitable example that BCNF is stronger than 3NF.
Soln. Third Normal Form
A relation schema R is in 3NF with respect to a set F of functional dependencies if, for all functional dependencies in F+ of the form X → Y, where X ⊆ R and Y ⊆ R, at least one of the following holds:
- X → Y is a trivial functional dependency.
- X is a superkey for schema R.
- Each attribute A in Y − X is contained in a candidate key for R.
A database design is in 3NF if each member of the set of relation schemas is in 3NF. 3NF, unlike BCNF, allows functional dependencies satisfying only the third condition; these dependencies are called transitive dependencies, and they are not allowed in BCNF. As all relation schemas in BCNF satisfy only the first two conditions, a schema in BCNF is also in 3NF; BCNF is a more restrictive constraint than 3NF.
Boyce-Codd Normal Form
Boyce-Codd normal form (BCNF) is a more rigorous version of 3NF designed to deal with relational tables that have (a) multiple candidate keys, (b) composite candidate keys, or (c) candidate keys that overlap. BCNF is based on the concept of determinants: a determinant column is one on which some of the columns are fully functionally dependent. A relational table is in BCNF if and only if every determinant is a candidate key.
There is an advantage to 3NF in that a 3NF design can always be obtained while guaranteeing a lossless join and dependency preservation. Nevertheless, there is a disadvantage to 3NF: if we do not eliminate all transitive dependencies, we may have to use null values to represent some of the possible meaningful relationships among data items, and there is the problem of repetition of information. For example, consider the Banker schema below and its associated functional dependencies.
Customer_name | Banker_name | Branch_name
Jones         | Johnson     | Perryridge
Smith         | Johnson     | Perryridge
Hayes         | Johnson     | Perryridge
Jackson       | Johnson     | Perryridge
Curry         | Johnson     | Perryridge
Turner        | Johnson     | Perryridge
Since banker_name → branch_name, we may want to represent relationships between values of banker_name and values of branch_name in our database. If we are to do so, however, either there must be a corresponding value for customer_name, or we must use a null value for the attribute customer_name.

The other difficulty with the Banker schema is repetition of information; in our example, the information indicating that Johnson works at the Perryridge branch is repeated. If we have to choose between BCNF and dependency preservation with 3NF, it is generally preferable to opt for 3NF: if we cannot test for dependency preservation efficiently, we risk the integrity of the data in our database. Thus, we normally choose to retain dependency preservation and select 3NF.
*************
Q 22. What is relational database design? What are the pitfalls in relational DB design? Explain with an example.
Soln: The goal of relational database design is to generate a set of schemas that allow us to:
- Store information without unnecessary redundancy.
- Retrieve information easily (and accurately).
A relation schema that has unnecessary redundancy makes the database complex and compromises data accuracy; such a database is known as a bad database. A bad design may have several pitfalls, including:
- Repetition of information.
- Inability to represent certain information.
- Loss of information.
Representation of Information
Suppose we have a schema, Lending-schema,
Lending-schema = (bname, bcity, assets, cname, loan#, amount)

and suppose an instance of the relation is the following:
bname    | bcity     | assets | cname | loan# | amount
SFC      | Burnaby   | 2M     | Tom   | L-10  | 10K
SFC      | Burnaby   | 2M     | Mary  | L-20  | 15K
Downtown | Vancouver | 8M     | Tom   | L-50  | 50K
Figure: Sample lending relation.

A tuple t in the new relation has the following attributes:
- t[assets] is the assets for t[bname]
- t[bcity] is the city for t[bname]
- t[loan#] is the loan number made by branch t[bname] to t[cname]
- t[amount] is the amount of the loan t[loan#]
If we wish to add a loan to our database, the original design would require adding a tuple to borrow:
(SFU, L-31, Turner, 1K)

In our new design, we need a tuple with all the attributes required for Lending-schema. Thus we need to insert
(SFU, Burnaby, 2M, Turner, L-31, 1K)

We are now repeating the assets and branch city information for every loan.
- Repetition of information wastes space.
- Repetition of information complicates updating.

Under the new design, we need to change many tuples if the branch's assets change. Let's analyze this problem:
- We know that a branch is located in exactly one city.
- We also know that a branch may make many loans.
- The functional dependency bname → bcity holds on Lending-schema.
- The functional dependency bname → loan# does not.
- These two facts are best represented in separate relations.
Another problem is that we cannot represent the information for a branch (assets and city) unless we have a tuple for a loan at that branch. Unless we use nulls, we can only have this information when there are loans, and must delete it when the last loan is paid off.
***************
Q 23. Why is decomposition required for normalization of a database? What are the desirable properties of decomposition? OR What are the different properties of decomposition? Explain with example. OR What do you mean by the decomposition of a relation? How does decomposition reduce redundancy?
Soln. Decomposition
Decomposition eliminates redundancy by decomposing a relation into several relations in a higher normal form. It is important to check that the decomposition does not introduce new problems: a good decomposition allows us to recover the original relation. Let R be a relation with attributes A1, A2, ..., An. Create two relations R1 and R2 with attributes B1, B2, ..., Bm and C1, C2, ..., Ci respectively, such that:
{B1, B2, ..., Bm} ∪ {C1, C2, ..., Ci} = {A1, A2, ..., An}
where R1 is the projection of R on B1, B2, ..., Bm and R2 is the projection of R on C1, C2, ..., Ci.
Problems with decomposition:
- Some queries become more expensive.
- Given instances of the decomposed relations, we may not be able to reconstruct the corresponding instance of the original relation (information loss).
Decomposition of a relation into a number of relations minimizes the redundancy and complexity of the relation. Careless decomposition, however, may lead to another form of bad design. Let us consider a Lending-schema:

bname    | bcity     | assets | cname | loan# | amount
SFC      | Burnaby   | 2M     | Tom   | L-10  | 10K
SFC      | Burnaby   | 2M     | Mary  | L-20  | 15K
Downtown | Vancouver | 8M     | Tom   | L-50  | 50K

The Lending-schema is decomposed into two schemas as given below, to minimize the redundancy and complexity of the relation:
1. Branch-customer-schema = (bname, bcity, assets, cname)
2. Customer-loan-schema = (cname, loan#, amount)

We construct our new relations from lending by:
branch-customer = Π_(bname, bcity, assets, cname)(lending)
customer-loan = Π_(cname, loan#, amount)(lending)

branch-customer:
bname    | bcity     | assets | cname
SFC      | Burnaby   | 2M     | Tom
SFC      | Burnaby   | 2M     | Mary
Downtown | Vancouver | 8M     | Tom

customer-loan:
cname | loan# | amount
Tom   | L-10  | 10K
Mary  | L-20  | 15K
Tom   | L-50  | 50K

Figure: The decomposed lending relation.
The decomposition of the Lending-schema is good if a natural join on the two new schemas reconstructs the lending relation; otherwise the decomposition is wrong. It appears that we can reconstruct the lending relation by performing a natural join on the two new schemas.
Desirable properties of decomposition:
- Lossless-join decomposition
- Dependency preservation
- No repetition of information
Lossless-Join Decomposition: The decomposition of a relation R into X1 and X2 is lossless if the join of the projections of R on X1 and X2 is equal to R itself (that is, it contains no false tuples). The decomposition of R into X and Y is lossless with respect to F if and only if the closure F+ contains either:
- X ∩ Y → X, that is, all attributes common to both X and Y functionally determine ALL the attributes in X; or
- X ∩ Y → Y, that is, all attributes common to both X and Y functionally determine ALL the attributes in Y.

We can claim that the decomposition is lossless. How can we decide whether a decomposition is lossless? Let R be a relation schema, let F be a set of functional dependencies on R, and let R1 and R2 form a decomposition of R. The decomposition is a lossless-join decomposition of R if at least one of the following functional dependencies is in F+:
1. R1 ∩ R2 → R1
2. R1 ∩ R2 → R2
Why is this true? Simply put, it ensures that the attributes involved in the natural join R1 ⋈ R2 are a candidate key for at least one of the two relations. This ensures that we can never get the situation where spurious tuples are generated, as for any value on the join attributes there will be a unique tuple in one of the relations.
Consider the join of the decomposed lending schemas. Figure 7.3 shows what we get by computing branch-customer ⋈ customer-loan:
bname    | bcity     | assets | cname | loan# | amount
SFC      | Burnaby   | 2M     | Tom   | L-10  | 10K
SFC      | Burnaby   | 2M     | Tom   | L-50  | 50K
SFC      | Burnaby   | 2M     | Mary  | L-20  | 15K
Downtown | Vancouver | 8M     | Tom   | L-10  | 10K
Downtown | Vancouver | 8M     | Tom   | L-50  | 50K
Figure 7.3: Join of the decomposed relations.
We notice that there are tuples in branch-customer ⋈ customer-loan that are not in lending. How did this happen?
- The intersection of the two schemas is cname, so the natural join is made on the basis of equality in cname.
- If two loans are for the same customer, there will be four tuples in the natural join.
- Two of these tuples will be spurious: they will not appear in the original lending relation, and should not appear in the database.
- Although we have more tuples in the join, we have less information.
- Because of this, we call this a lossy or lossy-join decomposition. A decomposition that is not lossy-join is called a lossless-join decomposition.
The only way we could make a connection between branch-customer and customer-loan was through cname. When we decompose Lending-schema into Branch-schema and Loan-info-schema, we will not have a similar problem:
Branch-schema = (bname, bcity, assets)
Loan-info-schema = (bname, cname, loan#, amount)
We'll now show our decomposition is lossless-join by showing a set of steps that generate the decomposition:
- First we decompose Lending-schema into
  Branch-schema = (bname, bcity, assets)
  Loan-info-schema = (bname, cname, loan#, amount)
- Since bname → assets bcity, the augmentation rule for functional dependencies implies that bname → bname assets bcity.
- Since Branch-schema ∩ Loan-info-schema = {bname}, our decomposition is lossless-join.
- Next we decompose Loan-info-schema into
  Loan-schema = (bname, loan#, amount)
  Borrow-schema = (cname, loan#)
- As loan# is the common attribute, and loan# → amount bname,

This is a lossless-join decomposition Dependency Preservationboth X and Y functionally determine ALL t Another desirable property in database design is dependency preservation. We would like to check easily that updates to the database do not result in illegal relations being created. It would be nice if our design allowed us to check updates without having to compute natural joins. To know whether joins must be computed, we need to determine what functional dependencies may be tested by checking each relation individually. o Let F be a set of functional dependencies on schema R. o Let {R1,R2,..,Rn}be a decomposition of R. o The restriction of F to Ri is the set of all functional dependencies in that include only attributes of Ri. o Functional dependencies in a restriction can be tested in one relation, as they involve attributes in one relation schema. o The set of restrictions F1,F2,..,Fn is the set of dependencies that can be checked efficiently. o We need to know whether testing only the restrictions is sufficient. o Let F=F1,F2,..,Fn. o F' is a set of functional dependencies on schema R, but in general, F F. o However, it may be that F+=F+ o If this is so, then every functional dependency in F is implied by F', and if F' is satisfied, then F must also be satisfied. o A decomposition having the property that F+=F+ is a dependencypreserving decomposition. 2. We can now show that our decomposition of Lending-schema is dependency preserving. o The functional dependency
bname → assets bcity

can be tested in one relation on Branch-schema. The functional dependency


loan# → amount bname
can be tested in Loan-schema.

3. As the above example shows, it is often easier not to apply the general algorithm for testing dependency preservation directly, as computing F+ takes exponential time.

4. An Easier Way To Test For Dependency Preservation

Really we only need to know whether the functional dependencies that are in F and not in F' are implied by those in F'. In other words, are the functional dependencies that are not easily checkable logically implied by those that are? Rather than compute F+ and F'+ and see whether they are equal, we can do this:
o Find F - F', the functional dependencies not checkable in one relation.
o See whether this set is obtainable from F' by using Armstrong's Axioms.
o This should take a great deal less work, as we usually have just a few functional dependencies to work on.
Use this simpler method on exams and assignments (unless you have exponential time available to you).
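To make the simpler test concrete, here is a quick worked check on the running example, followed by a hypothetical case (not part of the lending schema) where Armstrong's Axioms are actually needed. For Lending-schema, F = {bname → assets bcity, loan# → amount bname}. The restriction to Branch-schema contributes bname → assets bcity, and the restriction to Loan-schema contributes loan# → amount bname, so F' contains every dependency of F and F - F' is empty: there is nothing left to derive, and the decomposition is dependency-preserving. By contrast, suppose some schema had F = {A → B, B → C, A → C} and were decomposed into R1 = (A, B) and R2 = (B, C). Then F - F' = {A → C}, and applying the transitivity axiom to A → B and B → C derives A → C from F', so that decomposition would be dependency-preserving too.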

Repetition of Information

Our decomposition does not suffer from the repetition of information problem.
o Branch and loan data are separated into distinct relations.
o Thus we do not have to repeat branch data for each loan.
o If a single loan is made to several customers, we do not have to repeat the loan amount for each customer.
o This lack of redundancy is obviously desirable.
o We will see how this may be achieved through the use of normal forms.
**********

Q 24: What do you understand by an ER diagram? How do we start an ERD? Soln.


An entity-relationship (ER) diagram is a specialized graphic that illustrates the interrelationships between entities in a database. ER diagrams use symbols to represent three different types of information: boxes are commonly used to represent entities, diamonds are normally used to represent relationships, and ovals are used to represent attributes.

There are three basic elements in ER models:
Entities are the "things" about which we seek information.
Attributes are the data we collect about the entities.
Relationships provide the structure needed to draw information from multiple entities.

Developing an ERD

Developing an ERD requires an understanding of the system and its components. Before discussing the procedure, let's look at a narrative created by Professor Harman.

Consider a hospital: Patients are treated in a single ward by the doctors assigned to them. Usually each patient will be assigned a single doctor, but in rare cases they will have two. Healthcare assistants also attend to the patients, and a number of these are associated with each ward. Initially the system will be concerned solely with drug treatment. Each patient is required to take a variety of drugs a certain number of times per day and for varying lengths of time.

The system must record details concerning patient treatment and staff payment. Some staff are paid part time, and doctors and care assistants work varying amounts of overtime at varying rates (subject to grade). The system will also need to track what treatments are required for which patients and when, and it should be capable of calculating the cost of treatment per week for each patient (though it is currently unclear to what use this information will be put).

How do we start an ERD?
1. Define Entities: these are usually nouns used in descriptions of the system, in the discussion of business rules, or in documentation; they can be identified in the narrative above.
2. Define Relationships: these are usually verbs used in descriptions of the system or in discussion of the business rules (entity ______ entity); they can also be identified in the narrative above.
3. Add attributes to the relations; these are determined by the queries, and may also suggest new entities (e.g. grade), or they may suggest the need for keys or identifiers.
What questions can we ask?
a. Which doctors work in which wards?
b. How much will be spent in a ward in a given week?
c. How much will a patient cost to treat?
d. How much does a doctor cost per week?
e. Which assistants can a patient expect to see?
f. Which drugs are being used?
4. Add cardinality to the relations. A many-to-many must be resolved into two one-to-manys with an additional entity. Usually this happens automatically, but sometimes it involves the introduction of a link entity (which will be all foreign key). Example: Patient-Drug (see the SQL sketch after this list).
5. This flexibility allows us to consider a variety of questions such as:
a. Which beds are free?
b. Which assistants work for Dr. X?
c. What is the least expensive prescription?
d. How many doctors are there in the hospital?
e. Which patients are family related?
6. Represent that information with symbols, using the standard E-R notation described above (entity boxes, relationship diamonds, attribute ovals).
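To make step 4 concrete, here is a minimal SQL sketch of resolving the Patient-Drug many-to-many with a link entity. All table and column names are invented for illustration; the narrative does not fix them.

-- Hypothetical sketch: the Patient-Drug many-to-many is resolved into two
-- one-to-manys through a link entity (here called Treatment).
CREATE TABLE Patient (
    patient_id INT PRIMARY KEY,
    pname      VARCHAR(50) NOT NULL,
    ward_id    INT NOT NULL              -- each patient is treated in a single ward
);

CREATE TABLE Drug (
    drug_id    INT PRIMARY KEY,
    drug_name  VARCHAR(50) NOT NULL
);

-- Link entity whose key is made up entirely of foreign keys, as step 4 notes.
CREATE TABLE Treatment (
    patient_id    INT NOT NULL REFERENCES Patient(patient_id),
    drug_id       INT NOT NULL REFERENCES Drug(drug_id),
    doses_per_day INT,                   -- "a certain number of times per day"
    length_days   INT,                   -- "for varying lengths of time"
    PRIMARY KEY (patient_id, drug_id)
);

-- Example use (question f above): which drugs are being used?
SELECT DISTINCT d.drug_name
FROM Drug d
JOIN Treatment t ON t.drug_id = d.drug_id;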

Reading an ERD

It takes some practice to read an ERD, but ERDs can be used with clients to discuss business rules, and they allow us to represent information such as that in the hospital narrative above in diagram form.

Q 25: Explain enhanced ER modeling concepts in brief. Soln.

ENHANCED ER MODEL

The ER modeling concepts are sufficient for representing many database schemas for the usual database applications, including those used in business and industry. Since 1970, however, much more complex databases have been developed. EXAMPLES: engineering design, telecommunications, multimedia, data mining, data warehousing, geographic information, etc. These types of databases require additional semantic data modeling concepts. The following extensions are presented in the enhanced ER model:
1. Class/subclass relationships and type inheritance.
2. Specialization and generalization.
3. Constraints on specialization and generalization.
4. Union construct.

SUBCLASSES, SUPERCLASSES, AND INHERITANCE

The EER model includes all the modeling concepts of the ER model.

SUBCLASS

An entity type is used to represent both a type of entity and the entity set, or collection of entities of that type, that exists in the database. EXAMPLE: The entity type EMPLOYEE describes the type of each employee entity; it also refers to the current set of EMPLOYEE entities in the COMPANY database. Employees may be subgrouped according to what they do, such as managers, secretaries, technicians, etc. These are subsets of EMPLOYEE, and we call these subgroupings SUBCLASSES. The relationship between the superclass (EMPLOYEE) and any of its subclasses is referred to as a superclass/subclass relationship. EXAMPLE: EMPLOYEE/SECRETARY. Every member of a subclass is also a member of the superclass, but not every member of the superclass need belong to a subclass. An important concept associated with subclasses is type inheritance: the subclasses of EMPLOYEE inherit the attributes of EMPLOYEE.

SPECIALIZATION AND GENERALIZATION

Specialization is the process of defining a set of subclasses of an entity type; that entity type is called the superclass of the specialization.

Q 26: Explain database tuning techniques in brief. Soln.

Performance tuning is not easy and there aren't any silver bullets, but you can go a surprisingly long way with a few basic guidelines. In theory, performance tuning is done by a DBA, but in practice the DBA is not going to have time to scrutinize every change made to a stored procedure. Learning to do basic tuning might save you from reworking code late in the game. Below is a list of the top 15 things developers should do as a matter of course to tune performance when coding. These are the low-hanging fruit of SQL Server performance: they are easy to do and often have a substantial impact. Doing these won't guarantee lightning-fast performance, but it won't be slow either. (A minimal sketch illustrating several of these tips follows tip 5.)

1. Create a primary key on each table you create and, unless you are really knowledgeable enough to figure out a better plan, make it the clustered index (note that if you set the primary key in Enterprise Manager it will be clustered by default).
2. Create an index on any column that is a foreign key. If you know it will be unique, set the flag to force the index to be unique.
3. Don't index anything else (yet).
4. Unless you need different behaviour, always owner-qualify your objects when you reference them in T-SQL. Use dbo.sysdatabases instead of just sysdatabases.
5. Use SET NOCOUNT ON at the top of each stored procedure (and SET NOCOUNT OFF at the bottom).

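A minimal T-SQL sketch of tips 1, 2, 4, and 5 (all object names here are invented for the example):

-- Tip 1: primary key, clustered by default in SQL Server.
CREATE TABLE dbo.Orders (
    OrderID    INT   NOT NULL PRIMARY KEY,   -- clustered index
    CustomerID INT   NOT NULL,               -- foreign key column
    Amount     MONEY NOT NULL
)
GO

-- Tip 2: index the foreign key column.
CREATE INDEX IX_Orders_CustomerID ON dbo.Orders (CustomerID)
GO

-- Tips 4 and 5: owner-qualified object references and SET NOCOUNT.
CREATE PROCEDURE dbo.GetCustomerOrders
    @CustomerID INT
AS
    SET NOCOUNT ON
    SELECT OrderID, Amount         -- only the columns needed (tip 7)
    FROM dbo.Orders                -- owner-qualified (tip 4)
    WHERE CustomerID = @CustomerID
    SET NOCOUNT OFF
GO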
6. Think hard about locking. If you're not writing banking software, would it matter if you take a chance on a dirty read? You can use the NOLOCK hint, but it's often easier to use SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED at the top of the procedure, then reset to READ COMMITTED at the bottom.
7. I know you've heard it a million times, but only return the columns and the rows you need.
8. Use transactions when appropriate, but allow zero user interaction while the transaction is in progress. I try to do all my transactions inside a stored procedure.
9. Avoid temp tables as much as you can, but if you need a temp table, create it explicitly using CREATE TABLE #temp.
10. Avoid NOT IN; instead use a left outer join, even though it's often easier to visualize the NOT IN.
11. If you insist on using dynamic SQL (executing a concatenated string), use named parameters and sp_executesql (rather than EXEC) so you have a chance of reusing the query plan. While it's simplistic to say that stored procedures are always the right answer, it's also close enough that you won't go wrong using them.
12. Get in the habit of profiling your code before and after each change. While you should keep in mind the depth of the change, if you see more than a 10-15% increase in CPU, reads, or writes, it probably needs to be reviewed.
13. Look for every possible way to reduce the number of round trips to the server. Returning multiple result sets is one way to do this.
14. Avoid index and join hints.
15. When you're done coding, set Profiler to monitor statements from your machine only, then run through the application from start to finish once. Take a look at the number of reads and writes, and the number of calls to the server. See anything that looks unusual? It's not uncommon to see calls to procedures that are no longer used, or to see duplicate calls. Impress your DBA by asking him to review those results with you.

Q 27: Explain OLTP in brief. Soln.

OLTP
Databases tend to get split up into a variety of different categories based on their application and requirements, and all of these categories naturally get buzzwords to help classify them and make distinctions in features more apparent. The most popular buzzword (well, acronym anyway) is OLTP, or Online Transaction Processing. Other classifications include Decision Support Systems (DSS), data warehouses, data marts, etc.

OLTP databases, as the name implies, handle real-time transactions, which inherently have some special requirements. If you're running a store, for instance, you need to ensure that as people order products the system properly and efficiently updates the inventory tables while it is updating the purchases tables, while it is updating the customer tables, and so forth. OLTP databases must be atomic in nature (an entire transaction either succeeds or fails; there is no middle ground), consistent (each transaction leaves the affected data in a consistent and correct state), isolated (no transaction affects the states of other transactions), and durable (changes resulting from committed transactions are persistent). All of this can be a fairly tall order but is essential to running a successful OLTP database. (A small sketch of such a transaction follows below.)
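As a hedged sketch of the atomicity requirement just described (table and column names are invented for the example), an order might be recorded like this:

-- Either both changes happen or neither does; a failure between the two
-- statements must not leave inventory and purchases out of step.
BEGIN TRANSACTION
    UPDATE Inventory
    SET    quantity_on_hand = quantity_on_hand - 1
    WHERE  product_id = 42

    INSERT INTO Purchases (customer_id, product_id, purchase_date)
    VALUES (7, 42, CURRENT_TIMESTAMP)
COMMIT TRANSACTION
-- On any error, issue ROLLBACK TRANSACTION instead of COMMIT,
-- restoring the original consistent state.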

Because OLTP databases tend to be the real front-line warriors, as far as databases go, they need to be extremely robust and scalable to meet needs as they grow. Whereas an undersized DSS database might force you to go to lunch early, an undersized OLTP database will cost you customers: nobody is going to order books from an online book store if the OLTP database can't update their shopping cart in less than 15 seconds. The OLTP feature you tend to hear about most often is "row-level locking", in which a given record in a table can be locked against updates by any other process until the transaction on that record is complete. This is akin to mutex locks in POSIX threading; in fact, OLTP shares a number of the same problems programmers face in concurrent programming. Just as you'll find anywhere, when you've got a bunch of different persons or processes all grabbing for the same thing at the same time (or at least the potential for that to occur), you're going to run into problems, and raw performance (getting your hands in and out as quickly as possible) is generally one of the solutions.

Q 28: Explain Embedded SQL in detail. Soln.

EMBEDDED SQL

1. SQL provides a powerful declarative query language. However, access to a database from a general-purpose programming language is required because:
o SQL is not as powerful as a general-purpose programming language. There are queries that cannot be expressed in SQL but can be programmed in C, Fortran, Pascal, Cobol, etc.
o Nondeclarative actions -- such as printing a report, interacting with a user, or sending the result to a GUI -- cannot be done from within SQL.
2. The SQL standard defines this embedding of SQL as embedded SQL, and the language in which SQL queries are embedded is referred to as the host language.
3. The result of a query is made available to the program one tuple (record) at a time.
4. To identify embedded SQL requests to the preprocessor, we use the EXEC SQL statement:

EXEC SQL embedded SQL statement END-EXEC

Note: A semicolon is used instead of END-EXEC when SQL is embedded in C or Pascal.
5. Embedded SQL statements: declare cursor, open, and fetch statements.

EXEC SQL
declare c cursor for
select cname, ccity
from deposit, customer
where deposit.cname = customer.cname
and deposit.balance > :amount

END-EXEC

where amount is a host-language variable.

EXEC SQL open c END-EXEC

This statement causes the database system to execute the query and to save the results within a temporary relation. A series of fetch statements is then executed to make the tuples of the result available to the program:

EXEC SQL fetch c into :cn, :cc END-EXEC

The program can then manipulate the variables cn and cc using the features of the host programming language. A single fetch request returns only one tuple, so we need to use a while loop (or equivalent) to process each tuple of the result until no further tuples remain (detected when a variable in the SQLCA is set). We use the close statement to tell the database system to delete the temporary relation that held the result of the query:

EXEC SQL close c END-EXEC

6. Embedded SQL can execute any valid update, insert, or delete statement.
7. The dynamic SQL component allows programs to construct and submit SQL queries at run time.
8. SQL-92 also contains a module language, which allows procedures to be defined in SQL.
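Putting the pieces above together, here is a hedged sketch of the full cursor loop as it might appear inside a C host program (with <stdio.h> included); the use of SQLCODE to detect the end of the result is an assumption about the particular precompiler:

/* Hypothetical sketch of the full cursor loop in a C host program. */
EXEC SQL BEGIN DECLARE SECTION;
    char cn[21];    /* customer name */
    char cc[31];    /* customer city */
    int  amount;    /* host variable used in the query */
EXEC SQL END DECLARE SECTION;

amount = 1000;
EXEC SQL declare c cursor for
    select cname, ccity
    from deposit, customer
    where deposit.cname = customer.cname
      and deposit.balance > :amount;

EXEC SQL open c;
for (;;) {
    EXEC SQL fetch c into :cn, :cc;
    if (SQLCODE != 0)    /* no further tuples (or an error) */
        break;
    printf("%s lives in %s\n", cn, cc);
}
EXEC SQL close c;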

Q 29: Describe the main phases involved in database design. Soln.

Database design is made up of two main phases: logical and physical database design.

Logical database design is the process of constructing a model of the data used in a company based on a specific data model, but independent of a particular DBMS and other physical considerations. In the logical database design phase we build the logical representation of the database, which includes identification of the important entities and relationships, and then translate this representation to a set of tables. The logical data model is a source of information for the physical design phase, providing the physical database designer with a vehicle for making tradeoffs that are very important to the design of an efficient database.

Physical database design is the process of producing a description of the implementation of the database on secondary storage; it describes the base tables, file organizations, and indexes used to achieve efficient access to the data, and any associated integrity constraints and security restrictions. In the physical database design phase we decide how the logical design is to be physically implemented in the target relational DBMS. This phase allows the designer to make decisions on how the database is to be implemented; therefore, physical design is tailored to a specific DBMS.

Q 30: Identify important factors in the success of database design. Soln.

The following factors are important to the success of database design:
o Work interactively with the users as much as possible.
o Follow a structured methodology throughout the data modeling process.
o Employ a data-driven approach.
o Incorporate structural and integrity considerations into the data models.
o Use normalization and transaction validation techniques in the methodology.
o Use diagrams to represent as much of the data models as possible.
o Use a database design language (DBDL).
o Build a data dictionary to supplement the data model diagrams.
o Be willing to repeat steps.

Q 31: Discuss the main activities associated with each step of the logical database design methodology. Soln.

The logical database design phase of the methodology is divided into two main steps.

In Step 1 we create a data model and check that the data model has minimal redundancy and is capable of supporting user transactions. The output of this step is the creation of a logical data model, which is a complete and accurate representation of the company (or part of the company) that is to be supported by the database. The purpose of Step 1 is to build a logical data model of the data requirements of a company (or part of a company) to be supported by the database.

Each logical data model comprises: entities, relationships, attributes and attribute domains, primary keys and alternate keys, and integrity constraints. The logical data model is supported by documentation, including a data dictionary and ER diagrams, which you'll produce throughout the development of the model.

In Step 2 we map the ER model to a set of tables. The structure of each table is checked using normalization. Normalization is an effective means of ensuring that the tables are structurally consistent and logical, with minimal redundancy. The tables are also checked to ensure that they are capable of supporting the required transactions. The required integrity constraints on the database are also defined.

Q 32: Discuss the main activities associated with each step of the physical database design methodology. Soln.

Physical database design is divided into six main steps (a sketch of Steps 1 and 2 in SQL follows the list):
Step 1 involves the design of the base tables and integrity constraints using the available functionality of the target DBMS.
Step 2 involves choosing the file organizations and indexes for the base tables. Typically, DBMSs provide a number of alternative file organizations for data, with the exception of PC DBMSs, which tend to have a fixed storage structure.
Step 3 involves the design of the user views originally identified in the requirements analysis and collection stage of the database system development lifecycle.
Step 4 involves designing the security measures to protect the data from unauthorized access.
Step 5 considers relaxing the normalization constraints imposed on the tables to improve the overall performance of the system. This is a step that you should undertake only if necessary,

because of the inherent problems involved in introducing redundancy while still maintaining consistency.
Step 6 is an ongoing process of monitoring and tuning the operational system to identify and resolve any performance problems resulting from the design, and to implement new or changing requirements.
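As a hedged illustration of Steps 1 and 2 (table, column, and index names are invented; the constraint echoes the account-balance example used earlier in this document):

-- Step 1: base table with integrity constraints expressed in the target DBMS.
CREATE TABLE Account (
    account_no INT           NOT NULL PRIMARY KEY,
    cname      VARCHAR(30)   NOT NULL,
    balance    DECIMAL(12,2) NOT NULL CHECK (balance >= 0)
);

-- Step 2: choose indexes to match the expected access paths,
-- e.g. frequent lookups of all accounts held by a customer.
CREATE INDEX idx_account_cname ON Account (cname);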
