OODBS

Chapter III Object-Oriented Database Systems
Nguyen Kim Anh Dept. of Information Systems, SoICT, HUT
Outline
- Object-Oriented Data Model
- Object-Oriented Database System (OODBS)
- Object-Oriented Data Definition Language
- Object-Oriented Query Language
- Index organizations for OODBS
- Query Optimization in OODBS
Object-Oriented Data Model

- Definition of an object
- Object Identity
- Object Structure
- Object-Oriented Concepts
- Graphical representation of a complex object
- Comparison of the states of two objects for equality
- Class Schema
Definition of an object
Objects: user-defined complex data types
An object has structure or state (variables) and methods (behavior/operations)
An object is described by four characteristics:
- Identifier: a system-wide unique id for an object
- Name: an object may also have a unique name in the DB (optional)
- Lifetime: determines whether the object is persistent or transient
- Structure: construction of objects using type constructors
Object Identity
- Each independent object stored in the database has a unique identity
- Identity is implemented by a unique, system-generated object identifier (OID)
Properties of an OID:
- Immutable: the OID value of a particular object should not change
- An OID should not depend on the object's physical address or on any of its attribute values
- Each OID is used only once

Most OO database systems allow for the representation of both objects and values (having no OIDs)
Object Structure
The state (current value) of a complex object may be constructed from other objects (or other values) by using certain type constructors
An object can be represented by a triple (i, c, v):
- i is a unique object identifier (OID)
- c is a type constructor
- v is the object state

Type constructors
- Basic types: atom (integer, real, string, Boolean, ...)
- Structured type: tuple
- Collection types: array and list (ordered), set and bag (unordered)
Object Structure
Object states
- c=atom ::: v is an atomic value from the domain
- c=set ::: v is a set of object identifiers {i1, i2, ..., in}
- c=tuple ::: v is a tuple <a1:i1, a2:i2, ..., an:in>
- c=list ::: v is an ordered list [i1, i2, ..., in]
- c=array ::: v is a single-dimensional array of object identifiers
- c=bag ::: v is a bag (unordered collection, duplicates allowed) of object identifiers
Object-Oriented Concepts
- Abstract Data Types: class definition provides extension to complex attribute types
- Encapsulation: implementation of operations and object structure hidden
- Inheritance: sharing of data within hierarchy scope, supports code reusability
- Polymorphism: operator overloading
Example 1: Complex Object

o8 = (i8, tuple, <NAME:i5, NUMBER:i4, MANAGER:i9, LOCATIONS:i7, MEMBERS:i10, CONTROL:i11>) ::: department 5
o9 = (i9, tuple, <MANAGER:i12, MANAGER_START_DATE:i6>)
o10 = (i10, set, {i12, i13, i14})
o11 = (i11, set, {i15, i16, i17})
Graphical representation of a complex object
[Figure: object instance of the department type. Legend: object, tuple, set. The root i8 (tuple) has the attributes NAME (i5, atom: Research), NUMBER (i4, atom), MANAGER (i9, tuple with the components MANAGER, referring to i12, and MANAGER_START_DATE, referring to i6, atom: 1988-05-22), LOCATIONS (i7, set: Houston, Bellaire, Sugarland), MEMBERS (i10, set of the tuples i12, i13, i14), and CONTROL (i11, set of the tuples i15, i16, i17).]
Comparison of the states (current values) of two objects for equality

Identical states (deep equality)
- the graphs representing their states are identical in every respect, including the OIDs at every level
Equal states (shallow equality)
- the graph structures must be the same
- all the corresponding atomic values in the graphs must be the same
- some corresponding internal nodes in the two graphs may have objects with different OIDs
Example 2: Identical vs. Equal Object States

o1 = (i1, tuple, <a1:i4, a2:i6>)
o2 = (i2, tuple, <a1:i5, a2:i6>)
o3 = (i3, tuple, <a1:i4, a2:i6>)
o4 = (i4, atom, 10)
o5 = (i5, atom, 10)
o6 = (i6, atom, 20)
- o1 and o2 have equal states
- o1 and o3 have identical states
- o4 and o5 have identical states; the objects o4 and o5 themselves are equal but not identical, because their OIDs differ
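The two comparisons can be sketched in Python on the objects of Example 2. The `Obj` class and both function names are illustrative assumptions, not part of any OODBMS interface; sets and bags are left out, and the object graph is assumed to be acyclic.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Obj:
    oid: str
    constructor: str  # "atom", "tuple" or "list"
    state: Any        # atom: value; tuple: dict attribute -> OID; list: list of OIDs


def identical_states(db, a, b):
    """States match in every respect, including the OIDs they refer to."""
    x, y = db[a], db[b]
    return x.constructor == y.constructor and x.state == y.state


def equal_states(db, a, b):
    """Same graph structure and atomic values; internal OIDs may differ."""
    x, y = db[a], db[b]
    if x.constructor != y.constructor:
        return False
    if x.constructor == "atom":
        return x.state == y.state
    if x.constructor == "tuple":
        return (x.state.keys() == y.state.keys()
                and all(equal_states(db, x.state[k], y.state[k]) for k in x.state))
    if x.constructor == "list":
        return (len(x.state) == len(y.state)
                and all(equal_states(db, p, q) for p, q in zip(x.state, y.state)))
    raise ValueError(f"unsupported constructor: {x.constructor}")


# Objects o1..o6 of Example 2, keyed by OID.
db = {
    "i1": Obj("i1", "tuple", {"a1": "i4", "a2": "i6"}),
    "i2": Obj("i2", "tuple", {"a1": "i5", "a2": "i6"}),
    "i3": Obj("i3", "tuple", {"a1": "i4", "a2": "i6"}),
    "i4": Obj("i4", "atom", 10),
    "i5": Obj("i5", "atom", 10),
    "i6": Obj("i6", "atom", 20),
}
```

With this data, `equal_states(db, "i1", "i2")` holds while `identical_states(db, "i1", "i2")` does not, because o1 refers to i4 where o2 refers to i5.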
Class Schema
Object-Oriented Database System (OODBS)

- A database system that incorporates all the important object-oriented concepts
- Some additional features:
  - unique object identifiers
  - persistent object handling
Advantages of OODBS
- The designer can specify the structure of objects and their behavior (methods)
- Better interaction with object-oriented languages such as Java and C++
- Definition of complex and user-defined types
- Encapsulation of operations and user-defined methods
Object-Oriented Data Definition Language(OODDL)

Using OODDL to define Employee, Date, and Department types
define type Employee:
    tuple ( name:        string;
            birthday:    date;
            address:     string;
            sex:         string;
            salary:      int;
            workfor:     Department;
            supervisor:  Employee;
            supervisee:  set(Employee);
            manage:      Department;
            workon:      set(Project); );
Attributes that refer to Employee, Department, and Project objects represent relationships among objects.
Using OODDL to define Employee, Date, and Department types (Cont.)

define type Date:
    tuple ( year:   integer;
            month:  integer;
            day:    integer; );
define type Department:
    tuple ( name:       string;
            number:     integer;
            manager:    tuple ( manager:    Employee;
                                startdate:  Date; );
            locations:  set (string);
            members:    set (Employee);
            control:    set (Project); );
Inverse references: department of an employee <-> employees of a department.
set (Employee) and set (Project) are sets of references.
Specifying Object Behavior via Class Operations

In relational model, selecting, inserting, deleting and modifying tuples are generic.
Define the behavior of a type of object based on the operations that can be externally applied to objects of that type:
- create (insert) or destroy (delete) objects
- update the object state
- retrieve parts of the object state
- apply some calculations
- a combination of retrieval, calculation, and update
Specifying Object Behavior via Class Operations (Continued)

- Interface (signature): defines the name and arguments (parameters) of each operation; it is included in the class definition
- Implementation (method): defined using a programming language; it is invoked by sending a message to the object to execute the corresponding method
Operations
1. object constructors
2. object destructor
3. object modifier
4. retrieval
Using OODDL to define Employee and Department classes

define class Employee:
    type tuple ( name:        string;        (* type definition *)
                 birthday:    date;
                 address:     string;
                 sex:         string;
                 salary:      int;
                 workfor:     Department;
                 supervisor:  Employee;
                 supervisee:  set(Employee);
                 manage:      Department;
                 workon:      set(Project); );
    operations                               (* definition of operations *)
        age:          integer;
        create_emp:   Employee;
        destroy_emp:  boolean;
end Employee;
Using OODDL to define Employee and Department classes (Continued)

define class Department:
    type tuple ( name:       string;         (* type definition *)
                 number:     integer;
                 manager:    tuple ( manager:    Employee;
                                     startdate:  Date; );
                 locations:  set (string);
                 members:    set (Employee);
                 control:    set (Project); );
    operations                               (* definition of operations *)
        number_of_emps:  integer;
        create_dept:     Department;
        destroy_dept:    boolean;
        assign_emp (e: Employee):  boolean;
            (* adds a new employee to the department *)
        remove_emp (e: Employee):  boolean;
            (* removes an employee from the department *)
end Department;
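For readers more familiar with an object-oriented programming language, the Department class can be approximated in Python as follows. This is only a sketch under assumptions: the constructor stands in for create_dept, destroy_dept is left to garbage collection, and only the name, number, and members attributes are modelled.

```python
class Employee:
    """Stand-in for the Employee class; object identity plays the role of the OID."""

    def __init__(self, name):
        self.name = name


class Department:
    def __init__(self, name, number):
        # Plays the role of the create_dept constructor operation.
        self.name = name
        self.number = number
        self.members = set()  # set(Employee): a set of references

    def number_of_emps(self):
        """Retrieval operation: number of employees in the department."""
        return len(self.members)

    def assign_emp(self, e):
        """Modifier: adds an employee; returns False if already a member."""
        if e in self.members:
            return False
        self.members.add(e)
        return True

    def remove_emp(self, e):
        """Modifier: removes an employee; returns False if not a member."""
        if e not in self.members:
            return False
        self.members.remove(e)
        return True
```

Callers change the members set only through assign_emp and remove_emp, which mirrors the encapsulation of the object state behind its operations.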
Class Operations
- object constructor: creates a new object
- destructor: destroys an object
- object modifier: modifies various attributes of an object
- dot notation: d.number_of_emps, where d is a reference to a department object and number_of_emps is an operation
- the same notation refers to attributes of an object: d.number, d.manager.startdate
Specifying Object Persistence via Naming and Reachability

Transient objects
- exist in the executing program and disappear once the program terminates
Persistent objects
- are stored in the database and persist after program termination
Naming mechanism
- give an object a unique persistent name through which it can be retrieved by this and other programs
Reachability mechanism
- make the object reachable from some persistent object
- an object B is said to be reachable from an object A if a sequence of references in the object graph leads from object A to object B
- e.g., if o8 of Example 1 is made persistent, then all other objects of that example also become persistent
- to define a persistent collection of objects of class C: create a named persistent object N whose state is a set or list of objects of class C, then add objects of C to the set or list, which makes them reachable from N
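Persistence by reachability amounts to a graph traversal from a persistent root. The sketch below runs a breadth-first search over the references of Example 1; the `refs` dictionary and the function name are illustrative assumptions, and objects whose state is not listed in Example 1 are treated as leaves.

```python
from collections import deque

# References taken from Example 1 (o8, o9, o10, o11).
refs = {
    "i8": ["i5", "i4", "i9", "i7", "i10", "i11"],
    "i9": ["i12", "i6"],
    "i10": ["i12", "i13", "i14"],
    "i11": ["i15", "i16", "i17"],
}


def reachable_from(refs, root):
    """Return the OIDs reachable from root by following references."""
    seen = {root}
    queue = deque([root])
    while queue:
        oid = queue.popleft()
        for target in refs.get(oid, ()):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# If o8 is made persistent, every object in this set becomes persistent too.
persistent = reachable_from(refs, "i8")
```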
Creating persistent objects by naming and reachability

define class DepartmentSet:
    type set (Department);
    operations
        add_dept (d: Department):     boolean;
        remove_dept (d: Department):  boolean;
        create_dept_set:              DepartmentSet;
        destroy_dept_set:             boolean;
end DepartmentSet;
persistent name AllDepartments: DepartmentSet;
(* AllDepartments is a persistent named object of type DepartmentSet *)
.....
d := create_dept;
(* creates a new department object in the variable d *)
.....
b := AllDepartments.add_dept (d);
(* makes d persistent by adding it to the persistent named object AllDepartments *)
The AllDepartments object is the extent of the class Department.
Differences between traditional databases and OO databases

traditional database models
when an entity type or class is defined in EER, it represents both type declaration and persistent set
OO approaches
- a class declaration specifies only the type and operations for a class of objects
- the user must define a persistent object whose value is the collection of references to all persistent objects of that class
Type Hierarchies and Inheritance

type (or class) hierarchy
- define new types based on other predefined types (or classes)
- a type is defined by a type name and a number of functions
- functions: attributes (instance variables), which behave like functions with zero arguments, and operations (methods)
TYPE_NAME: function, function, ..., function
PERSON: Name, Address, Birthdate, Age, SSN
EMPLOYEE subtype-of PERSON: Salary, HireDate, Seniority
STUDENT subtype-of PERSON: Major, GPA
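The PERSON, EMPLOYEE, and STUDENT types map naturally onto classes in an object-oriented language. The Python sketch below is one possible rendering: it assumes that Age and Seniority are computed functions (whole years since Birthdate and HireDate), which the type listing does not state, and the helper `_full_years` is hypothetical.

```python
from dataclasses import dataclass
from datetime import date


def _full_years(start: date, today: date) -> int:
    """Whole years elapsed between start and today."""
    passed = (today.month, today.day) >= (start.month, start.day)
    return today.year - start.year - (0 if passed else 1)


@dataclass
class Person:
    name: str
    address: str
    birthdate: date
    ssn: str

    def age(self, today: date) -> int:
        return _full_years(self.birthdate, today)


@dataclass
class Employee(Person):  # EMPLOYEE subtype-of PERSON
    salary: int
    hire_date: date

    def seniority(self, today: date) -> int:
        return _full_years(self.hire_date, today)


@dataclass
class Student(Person):  # STUDENT subtype-of PERSON
    major: str
    gpa: float
```

An Employee instance inherits name, address, birthdate, ssn, and age from Person and adds its own functions, as the subtype-of listing describes.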
Inheritance
Multiple inheritance
- when T is a subtype of two (or more) types, T inherits the functions (attributes and methods) of both supertypes
- the result is a type lattice instead of a type hierarchy
- if a function is inherited from some common supertype, it is inherited only once
- ambiguity resolution options: alert the user, apply a system default, or disallow multiple inheritance
Inheritance (Continued)
Selective inheritance
- a subtype inherits only some of the functions of a supertype
- an EXCEPT clause may be used to list the functions in a supertype that are not to be inherited by the subtype
Object-Oriented Query Language

- Declarative query language
- Not computationally complete
- Syntax based on SQL (select, from, where)
- Additional flexibility (queries with user-defined operators and types)
SQL3 "Object-oriented SQL"

- Foundation for several OO database management systems (ORACLE8, DB2, etc.)
- New features: relational and object-oriented
- Relational features: new data types, new predicates, enhanced semantics, additional security, and an active database
- Object-oriented features: support for functions and procedures
- Set-oriented query language
Object Query Language (OQL)

Syntax based on SQL (select, from, where):
    select <structured query result>
    from <class [class variable]> [,<path>.]
    where <path expressions>
Path-oriented query language

- Path: $C_1.A_1.A_2.\ldots.A_{n-1}.A_n$, where the attributes $A_1, A_2, \ldots, A_{n-1}$ take their values from the classes $C_2, C_3, \ldots, C_n$
- Path expression: $C_1.A_1.A_2.\ldots.A_{n-1}.A_n = v$
Example of OQL query

The following is a sample query: "What are the names of the black products?"
    select distinct p.name
    from products p
    where p.color = "black"
The query is valid in both SQL and OQL, but the results are different.
Result of the query (SQL)

Original table
Product no | Name          | Color
P1         | Ford Mustang  | Black
P2         | Toyota Celica | Green
P3         | Mercedes SLK  | Black

Result
Name
Ford Mustang
Mercedes SLK

- The statement queries a relational database. => Returns a table with rows.
Result of the query (OQL)

Original table
Product no | Name          | Color
P1         | Ford Mustang  | Black
P2         | Toyota Celica | Green
P3         | Mercedes SLK  | Black

Result: a collection of two String objects, "Ford Mustang" and "Mercedes SLK"

- The statement queries an object-oriented database. => Returns a collection of objects.
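The object-oriented reading of the query can be imitated in Python, where the products are objects and the answer is a collection of string objects rather than a table. The `Product` class and its field names are assumptions made for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    product_no: str
    name: str
    color: str


products = [
    Product("P1", "Ford Mustang", "Black"),
    Product("P2", "Toyota Celica", "Green"),
    Product("P3", "Mercedes SLK", "Black"),
]

# select distinct p.name from products p where p.color = "black"
# "distinct" is modelled by building a set; the comparison ignores case.
black_names = {p.name for p in products if p.color.lower() == "black"}
```

The result `black_names` is a set of two strings, which corresponds to the "collection of objects" returned by OQL.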
Comparison
Queries look very similar in SQL and OQL; sometimes they are the same. In fact, the results they return are very different.
Query returns:
- OQL: an object or a collection of objects
- SQL: a tuple or a table
Index organizations for OODBS

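Path index (PX):
- a path $P = C_1.A_1.A_2.\ldots.A_{n-1}.A_n$
- a path index (PX) on $P$, with $C_i$, $1 \le i \le n$: $\{(v, S) \mid v \in DOM(A_n) \text{ and } S = \{O_i.O_{i+1}.\ldots.O_n \mid O_1.O_2.\ldots.O_n.v \text{ is an instantiation of } P\}\}$

Nested index (NX):
- a path $P = C_1.A_1.A_2.\ldots.A_{n-1}.A_n$
- a nested index (NX) on $P$: $\{(v, S) \mid v \in DOM(A_n) \text{ and } S = \{O \mid O_1.O_2.\ldots.O_n.v \text{ is an instantiation of } P,\ O_i = O,\ 1 \le i \le n\}\}$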
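To make the two definitions concrete, the sketch below builds both indexes for the path Employee.workfor.name. It is a simplification under stated assumptions: the sample OIDs and values are made up, the path index stores only full instantiations (employee OID, department OID), and the nested index stores only the OIDs of the first class on the path (the case i = 1).

```python
from collections import defaultdict

# Hypothetical sample data: OID -> object state.
departments = {"d1": {"name": "CS"}, "d2": {"name": "Math"}}
employees = {
    "e1": {"workfor": "d1"},
    "e2": {"workfor": "d1"},
    "e3": {"workfor": "d2"},
}


def build_path_index(employees, departments):
    """Map each name value v to the instantiations (employee, department) ending in v."""
    index = defaultdict(set)
    for e_oid, e in employees.items():
        d_oid = e["workfor"]
        index[departments[d_oid]["name"]].add((e_oid, d_oid))
    return dict(index)


def build_nested_index(employees, departments):
    """Map each name value v to the OIDs of the employees whose path ends in v."""
    index = defaultdict(set)
    for e_oid, e in employees.items():
        index[departments[e["workfor"]]["name"]].add(e_oid)
    return dict(index)


px = build_path_index(employees, departments)
nx = build_nested_index(employees, departments)
```

The nested index answers "which employees work for CS" with a single lookup, while the path index also keeps the intermediate department OIDs, so it can serve predicates on any class along the path.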

Multi-index (MX):
- a path $P = C_1.A_1.A_2.\ldots.A_{n-1}.A_n$
- a multi-index (MX) on $P$: $\bigcup_{1 \le i \le n} \{I_{i,1}, I_{i,2}, \ldots, I_{i,n_i}\}$, where $I_{i,j}$, $1 \le i \le n$, $1 \le j \le n_i$, is a single index on path $C_{i,j}.A_i$ and $n_i$ is the number of subclasses rooted at $C_i$
- a single index for $C_{i,j}.A_i$ is $\{(O, S) \mid O \in DOM(A_i) \text{ and } S = \{O' \mid O'.A_i = O\}\}$
- indexes $I_{i,j}$, $1 \le i < n$, have OIDs as key values and are called identity indexes
- indexes $I_{n,1}$ are called equality indexes

Inherited multi-index (IMX):
- a path $P = C_1.A_1.A_2.\ldots.A_{n-1}.A_n$
- an inherited multi-index (IMX) on $P$: $\bigcup_{1 \le i \le n} \{I_i\}$, where $I_i$ is a class-hierarchy index on path $C_i.A_i$
- a class-hierarchy index associates with each value of an attribute $A_i$ the OIDs of the instances of a class $C_i$ and of all its subclasses
- an inherited multi-index differs from the multi-index in that it maintains a single index for all classes belonging to the same inheritance hierarchy
- this technique always requires a number of indexes equal to the path length
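A multi-index evaluates a path predicate by chaining single indexes from the end of the path back to its start. The sketch below does this for Employee.workfor.name with one equality index (on Department.name) and one identity index (on Employee.workfor); the sample data and function names are assumptions, and class hierarchies are ignored, so each class has exactly one index.

```python
from collections import defaultdict

# Hypothetical sample data: OID -> object state.
departments = {"d1": {"name": "CS"}, "d2": {"name": "Math"}}
employees = {
    "e1": {"workfor": "d1"},
    "e2": {"workfor": "d1"},
    "e3": {"workfor": "d2"},
}


def build_single_index(objects, attribute):
    """Single index on one attribute: attribute value -> OIDs holding that value."""
    index = defaultdict(set)
    for oid, state in objects.items():
        index[state[attribute]].add(oid)
    return dict(index)


# Equality index: keys are atomic values. Identity index: keys are OIDs.
equality_index = build_single_index(departments, "name")
identity_index = build_single_index(employees, "workfor")


def employees_by_department_name(value):
    """Scan the equality index first, then follow the OIDs through the identity index."""
    result = set()
    for d_oid in equality_index.get(value, set()):
        result |= identity_index.get(d_oid, set())
    return result
```

Each lookup costs one index scan per path step, which is the price paid for indexes that are cheap to maintain individually.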
Query Optimization in OODBS

- Algebraic transformation-based query optimization
- Graph-based query optimization (using path indexes)
- Method materialisation
Algebraic Transformation-based query optimization

Object algebra
- The object algebra is a many-sorted algebra.
- Algebraic operators are defined for the various kinds of value sets.
- Operators can be classified as constructors, projection operators (performing access to components of a complex value), selection, and iteration.
Algebraic optimization rules

Algebraic optimization rules:
- are valid for the defined operators and represent semantically equivalent query transformations;
- allow algebraic expressions to be transformed into semantically equivalent but more efficiently executable ones.
Example of algebraic query

The following is a sample OQL query: "What are the names of the employees who work for the CS department?"
    select distinct p.name
    from employee p
    where p.workfor.name = "CS"
Algebraic query:
iS[S.name.v(s)](P[Pname(V(D(workfor(V(p)))))=CS](employee))
Graph-based query optimization using path indexes

- Access Path Selection
- Generalized Index Intersection
- Query Graph Reductions
- Generation of Least-Cost Evaluation Plan
Access Path Selection

- Eligible indexes for Q, denoted by EI(Q), are the indexes that are useful in query processing.
- Eligible indexes for a condition that compares pathi with a value are the indexes constructed on any subpath of pathi.
- Predicates that can be processed by indexes are called index-processible predicates (IP); predicates that cannot are called residual predicates (RP).

Query Graph
Legend:
- ai/j: the link (i.e., the attribute) that connects the classes Ci and Cj
- dashed line: the path index constructed on the corresponding path expression

The problem of determining eligible indexes in query optimization has exponential time complexity.
Use a simple index selection heuristic:
- select all eligible indexes and pointers
- take full advantage of the path indexes
- the result is not compromised by the proposed index selection heuristic
Generalized Index Intersection (for simple indexes)
Generalized Index Intersection (for path indexes)
Query Graph Reductions

Objective of Reductions:
- determine the classes that are replaced by index scans and remove them from the query graph
- use a Higraph for modeling the process of query graph reduction
A Higraph has one extra element, called a supernode, that contains one or more subnodes (classes).

The query graph reduction algorithm consists of the following three steps:
1. For query graph QG, determine the set of eligible indexes EI(QG).
2. For each $IDX(path_i) \in EI(QG)$:
   1) remove all primitive classes and edges in pathi;
   2) create a new supernode that contains all user-defined classes in pathi; the supernode denotes the OID tuples of its subnodes that satisfy the predicates matched with IDX(pathi). (Note: the user-defined classes on pathi are not removed, since residual predicates may exist for them.)
3. If two supernodes (relations) T1 and T2 have a common subnode, perform a natural join of them. The join result is denoted by another supernode T12, and the nodes T1 and T2 are removed. Repeat this step until no two supernodes in the query graph share a subnode.
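Step 3 of the reduction can be sketched as a fixpoint loop over supernodes. In the Python sketch below, a supernode is represented (as an assumption of this illustration) by a dictionary with the list of its subnode classes and a list of OID tuples stored as class-to-OID mappings; the sample supernodes T1, T2, T3 and the Vehicle class are hypothetical.

```python
def natural_join(t1, t2):
    """Join two supernodes on the subnodes (classes) they have in common."""
    common = set(t1["classes"]) & set(t2["classes"])
    rows = [
        {**r1, **r2}
        for r1 in t1["rows"]
        for r2 in t2["rows"]
        if all(r1[c] == r2[c] for c in common)
    ]
    classes = sorted(set(t1["classes"]) | set(t2["classes"]))
    return {"classes": classes, "rows": rows}


def merge_supernodes(supernodes):
    """Repeat the join until no two supernodes share a subnode."""
    nodes = list(supernodes)
    merged = True
    while merged:
        merged = False
        for i in range(len(nodes)):
            for j in range(i + 1, len(nodes)):
                if set(nodes[i]["classes"]) & set(nodes[j]["classes"]):
                    joined = natural_join(nodes[i], nodes[j])
                    # T1 and T2 are removed and replaced by T12.
                    nodes = [n for k, n in enumerate(nodes) if k not in (i, j)]
                    nodes.append(joined)
                    merged = True
                    break
            if merged:
                break
    return nodes


t1 = {"classes": ["Employee", "Department"],
      "rows": [{"Employee": "e1", "Department": "d1"},
               {"Employee": "e3", "Department": "d2"}]}
t2 = {"classes": ["Department", "Project"],
      "rows": [{"Department": "d1", "Project": "p1"}]}
t3 = {"classes": ["Vehicle"], "rows": [{"Vehicle": "v1"}]}

reduced = merge_supernodes([t1, t2, t3])
```

Here T1 and T2 share the subnode Department and are replaced by their join, while T3 shares nothing and is left untouched.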
Generation of Least-Cost Evaluation Plan

The search algorithm generates all possible join orders (alternative plans) from the reduced query graph (RQG), estimates the evaluation cost of each join order, and finally chooses the least-cost join order based on the cost model:
(1) generation of the search tree
(2) cost estimation and generation of the least-cost evaluation plan

Generation of Search Tree

Cost Estimation
The joins of the branch $\langle C_1, C_2, \ldots, C_n \rangle$ can be processed by a sequence of binary joins.
The cost formula for the binary join of $C_i$ and $C_{i+1}$ (using the pointer-based sort-merge join algorithm):
$$\mathrm{cost}(C_i \bowtie_{a_i} C_{i+1}) = \mathrm{cost}(C_i) + \mathrm{sort}(C_i, a_i) + \mathrm{cost}(C_{i+1})$$
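The binary join cost formula, combined with the exhaustive search over join orders, can be illustrated with a small Python sketch. Everything numeric here is a made-up assumption: the class names A, B, C, the access and sort costs, and the simplification that the cost of a branch is the plain sum of its binary join costs and that every permutation is a valid join order.

```python
from itertools import permutations

# Hypothetical per-class costs in arbitrary units (not taken from the slides).
ACCESS_COST = {"A": 10, "B": 5, "C": 20}
SORT_COST = {"A": 3, "B": 2, "C": 8}


def binary_join_cost(ci, cj):
    """cost(Ci join Cj) = cost(Ci) + sort(Ci, ai) + cost(Cj)."""
    return ACCESS_COST[ci] + SORT_COST[ci] + ACCESS_COST[cj]


def branch_cost(order):
    """Sum of the binary join costs along the branch <C1, C2, ..., Cn>."""
    return sum(binary_join_cost(ci, cj) for ci, cj in zip(order, order[1:]))


def least_cost_order(classes):
    """Enumerate all join orders and keep the cheapest one."""
    return min(permutations(classes), key=branch_cost)


best = least_cost_order(["A", "B", "C"])
```

A real optimizer would prune the search tree and account for intermediate result sizes; the sketch only shows how the cost model drives the choice between alternative plans.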
Method Materialisation
Method materialisation consists of:
- computing the result of a method once,
- storing the method's result persistently in the database,
- using the persistent result value when the method is invoked,
- maintaining the materialised results: updating the values of materialised methods when the objects used for computing them (base objects) change.
Effects:
- reduces an application's response time for accessing a method's result, especially when the method's execution takes a long time;
- adds method maintenance cost.
In order to improve a system's performance, only the right set of methods should be materialised.
Method materialisation (precomputation, caching) was proposed in the context of indexing techniques and query optimisation.
Two important issues arise for method materialisation:
(1) what technique to use for method materialisation, and
(2) which methods to materialise?
Use the dynamic hierarchical method materialisation technique:

- hierarchical: if the method mi is materialised, then the other methods called by mi are materialised as well;
- dynamic: the system decides whether or not to materialise a given method based on the gathered statistics (method reads and updates of base objects).
Storage Structures
Materialised Methods Dictionary (MMD) contains information about all methods:
- the method name and class,
- the array of input arguments,
- the method return type,
- the method implementation, and
- a flag indicating whether the method was materialised.
Materialised Method Results Structure (MMRS) stores the following information about every materialised method:
(1) the identifier of the method,
(2) the identifier of the object the method was invoked for,
(3) the array of input argument values the method was invoked with,
(4) the value returned by the method when executed for the given object and the given array of input argument values.
When the result of a materialised method mi is required, MMRS is searched for it. If it is not found, the value of mi is computed and stored in MMRS. When an object used to compute the materialised value of mi is updated or deleted, the materialised value becomes invalid and is removed from MMRS.
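The lookup, compute-and-store, and invalidate cycle described for MMRS behaves like a keyed cache. The Python sketch below is a minimal illustration with assumed names (`MMRS.get`, `MMRS.invalidate_object`); it keeps results in memory rather than persistently, and it invalidates only the entries of the object the method was invoked for, whereas a full system would also follow inverse references to other base objects.

```python
class MMRS:
    """In-memory stand-in for the Materialised Method Results Structure."""

    def __init__(self):
        # (method id, object id, argument values) -> materialised value
        self._results = {}

    def get(self, method_id, oid, args, compute):
        """Return the materialised value, computing and storing it if missing."""
        key = (method_id, oid, tuple(args))
        if key not in self._results:
            self._results[key] = compute(oid, *args)
        return self._results[key]

    def invalidate_object(self, oid):
        """Remove every materialised value computed for the given object."""
        for key in [k for k in self._results if k[1] == oid]:
            del self._results[key]

    def __len__(self):
        return len(self._results)


# Hypothetical method: counts how often it is really executed.
executions = []


def yearly_salary(oid, months):
    executions.append(oid)
    return 1000 * months


store = MMRS()
first = store.get("yearly_salary", "e1", [12], yearly_salary)
second = store.get("yearly_salary", "e1", [12], yearly_salary)  # served from MMRS
```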
Graph of Method Calls (GMC) represents the dependencies between methods, where one method calls another.
GMC stores pairs of values:
- the identifier of the calling method, and
- the identifier of the method being called.
GMC is used by the procedure that maintains the materialised results of methods.
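One way the maintenance procedure can use the GMC is to propagate an invalidation from a method to every method that calls it, directly or indirectly. The following Python sketch assumes this upward propagation; the function name and the sample method identifiers are hypothetical.

```python
from collections import defaultdict


def methods_to_invalidate(gmc_pairs, changed_method):
    """Return changed_method plus all of its direct and indirect callers.

    gmc_pairs is a list of (calling method id, called method id) pairs.
    """
    callers = defaultdict(set)
    for caller, callee in gmc_pairs:
        callers[callee].add(caller)

    invalid = set()
    stack = [changed_method]
    while stack:
        method = stack.pop()
        if method not in invalid:
            invalid.add(method)
            stack.extend(callers.get(method, ()))
    return invalid


# Hypothetical GMC: budget calls total_salary, which calls salary.
gmc = [
    ("total_salary", "salary"),
    ("budget", "total_salary"),
    ("age", "birth_year"),
]
affected = methods_to_invalidate(gmc, "salary")
```

The `invalid` set doubles as a visited set, so the traversal also terminates when the call graph contains recursive methods.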
In order to invalidate dependent methods, the system must also be able to find inverse references in the object composition hierarchy. These references are maintained in a data structure called the Inverse References Index (IRI).
Method Value Index (MVI) is an index defined on the results of methods. Every method of a class has its own MVI. The index stores the following:
(1) the value of a method input argument,
(2) the method result, and
(3) the identifier of the object the method was invoked for.
By using this index, the system is able to quickly find answers to queries that use methods. The MVI is populated when methods are materialised.
Dynamic Method Materialisation

The dynamic method materialisation technique consists in:
(1) gathering method usage statistics, and
(2) based on these statistics, finding the methods whose materialisation increases the system's performance and the methods whose materialisation deteriorates it.
A software module called the method analyser and optimiser monitors method access patterns, gathers execution statistics, and makes the final selection of methods for materialisation.

Tuning of the system is performed in the following two steps.
Step 1:
- select the set SM of methods for materialisation;
- materialise the results of these methods on their first calls;
- monitor the usage of the methods and gather execution statistics for the set of transactions that use mi and its materialised values, called the batch transaction set.
The size of the batch transaction set is a parameter set by the system administrator.
Step 2:
- identify the methods whose materialisation increases the system's performance;
- automatically dematerialise the methods whose materialisation deteriorates the system's performance.

Gathering method usage statistics
For a given method mi, the execution statistics include:
- method execution times and the number of disk accesses for every object and every set of input argument values,
- the number of base object updates,
- the number of reads of the materialised values of mi,
- method invalidation times and the number of disk accesses for every object and every set of input argument values,
- method recomputation times and the number of disk accesses for every object and every set of input argument values,
- the time and the number of disk accesses required for finding an already materialised value.

Selecting methods for materialisation
Cost Model
- r: number of transactions reading the materialised value v of method mi
- u: number of transactions updating a base object of mi
- r + u: number of transactions in the batch transaction set
- tRMAT: time of reading a materialised value of mi using MMRS
- tEXEC: execution time of the non-materialised method mi
- tREMAT: time of rematerialising value v of mi after its base object was updated
All the discussed times include I/O as well as CPU times.

The materialisation of method mi will reduce query response time if the following holds:
The coefficient in this condition represents the factor by which the overall system response time is to be reduced. It takes its value from the range (0, 1) and is a tuning parameter set by an administrator.

In the worst case, i.e. when all branches in the GMC have to be invalidated, the rematerialisation time (tREMAT) includes:
- tINV: invalidation time of a materialised result
- tEXEC: time of computing a method result
- tWMAT: time of writing the materialised result to disk
Thus tREMAT can be expressed as follows:
$$t_{REMAT} = t_{INV} + t_{EXEC} + t_{WMAT}$$

Combining Formula 1 and Formula 2 yields Formula 3, which expresses the ratio of the number of updates to the number of reads.
For a given method mi and a given batch transaction set, if the inequality in Formula 3 is true, then the materialisation of mi increases the system's performance. Otherwise, mi has to be dematerialised.
Object Oriented Databases

Advantages
- Good integration with Java, C++, etc.
- Can store complex information
- Fast to recover whole objects
- Has the advantages of the (familiar) object paradigm
Disadvantages
- There is no underlying theory to match the relational model
- Can be more complex and less efficient
- OODB queries tend to be procedural, unlike SQL
Object Relational Databases

Extend an RDBMS with object concepts
- Data values can be objects of arbitrary complexity
- These objects have inheritance, etc.
- You can query the objects as well as the tables
An object-relational database

- Retains most of the structure of the relational model
- Needs extensions to query languages (SQL or relational algebra)