
Architecture of Distributed Data Base Systems

Sudha Ram and Clark L. Chastain


Department of Management Information Systems, College of Business and Public Administration, University of Arizona, Tucson

Research and development over the last twenty years has culminated in the widespread use of data base management system (DBMS) software. As usage has grown, the desire to link and integrate separate data bases has resulted in substantial effort being directed towards the design of distributed data base systems. This paper presents the major architectures which have emerged for distributed data base systems. The architectures are compared and evaluated. Sixteen distributed data base management system (DDBMS) projects have been surveyed and classified according to the architectures. The various projects represent widely differing stages of effort: academic research, industrial testbeds, and commercial prototypes. The survey reviews important features of the DDBMSs. It does not attempt a qualitative performance comparison. The focus is instead on identification of overall architectural characteristics. The usefulness of the survey lies in the summary information which it imparts on current research, and in the classification scheme for generic distributed data base architectures which it provides.

Address correspondence to Sudha Ram, Department of Management Information Systems, College of Business and Public Administration, University of Arizona, Tucson, AZ 85721.

1. INTRODUCTION

Research and development over the last twenty years has culminated in the widespread use of data base management system (DBMS) software. As usage has grown, the desire to link and integrate separate data bases has resulted in substantial effort being directed towards the design of distributed data base systems. Figure 1 illustrates a distributed data base system. A distributed data base system (DDS) is characterized by two properties [5]:

1. Subsets of a data base are created over several computer hosts that may or may not be geographically distinct sites. Sufficient accessing power is provided to manipulate these subsets.
2. The hosts are linked by a network. All data are managed by a single distributed data base management system (DDBMS). A user of the system is typically unaware of the distribution.

The advantages of a DDS relative to a centralized data base system include increased system reliability and availability [13]. For a large organization with multiple geographic locations, the implementation of a DDBMS may be the only feasible means of introducing data base technology in a reasonably controlled manner. More frequently, however, the design and installation of a DDBMS has been an effort undertaken only after several separate DBMSs have already been installed [14].

This paper will present the major architectures that have emerged for distributed data base systems. The architectures will be compared and evaluated. Sixteen DDBMS projects will be surveyed and classified according to the architectures. The various projects represent widely differing stages of effort: academic research, industrial testbeds, and commercial prototypes. The survey reviews important features of the DDBMSs. It does not attempt a qualitative performance comparison. The focus is instead on identification of overall architectural characteristics. The usefulness of the survey lies in the summary information which it imparts on current research, and in the classification scheme for generic distributed data base architectures which it provides.

Section 2 begins by describing overall considerations in DDBMS design. Section 3 describes the general DDBMS architectures that have emerged. Section 4 surveys the design features of distributed data base systems that are currently installed or under development. Section 5 concludes by identifying areas for further research in distributed data base design.

2. DISTRIBUTED DATA BASE DESIGN CONSIDERATIONS

2.1 Design Issues

The design of a DDBMS usually has, at a minimum, the objective of maintaining the autonomy of local data bases. That is, the implementation of the DDBMS

The Journal of Systems and Software 10, 77-95 (1989) © 1989 Elsevier Science Publishing Co., Inc. 0164-1212/89/$3.50


Figure 1. Structure of a distributed data base system. (Diagram labels: sites, each with a local data base, a local DBMS, and a local network DBMS interface; a global DDBMS.)

should require neither changes to local data nor any reprogramming of the local DBMS. In addition, many distributed data base systems seek to achieve location transparency, so that system users are not (necessarily) aware of the physical location of the data.

The designer of a DDBMS also faces other considerations that are not encountered in a centralized data base system using a single DBMS:

1. Varying system configurations: When several sites are involved, it is common to encounter processors of different manufacture, with each processor possibly also utilizing a different operating system.
2. Communications: If system-response time is to be acceptable, and if the system is to achieve its availability goals, then a rapid and reliable communications network is critical.
3. Global facilities: Unless the system is to provide read-only capability, the issues of concurrency control will have to be addressed. Recovery from crashes due to various causes will also have to be addressed.
4. Global dictionary: A dictionary or catalogue may need to be provided so that users can determine the total data available in the system. The dictionary need not necessarily provide the actual physical location of any of the data.
5. Global directory: A directory is maintained with routing information so that global queries can be broken down into subqueries. Each subquery should be capable of being answered at one of the component data bases. The global directory contains the information needed to identify the component (local) data bases that contain data relevant to the global query. The global directory also stores data used in physically mapping the subqueries to the appropriate local data base.

2.2 Homogeneous or Heterogeneous DDBMS

Alternate data models have appeared in the development of DBMS technology. Commercially, the three dominant data models are the hierarchical [38], the CODASYL (or network) [18], and the relational [11]. A DDBMS may use one or more of these DBMSs and can be either homogeneous or heterogeneous [6]. Homogeneity and heterogeneity can be considered at different levels in a DDBMS, namely, the hardware, the operating system, or the local DBMS. In this paper, the important distinction is made at the level of the local DBMS. This is realistic since the same vendor offers one DBMS to run on different hardware using various operating systems. A DDBMS is homogeneous if the same DBMS occurs at each site regardless of the hardware and operating system. For instance, distributed INGRES is a homogeneous relational DDBMS because each component DBMS is, and must be, an instance of the relational DBMS named INGRES. A heterogeneous DDBMS uses at least two different DBMSs at the local sites.

Whether the DDBMS is homogeneous or heterogeneous, we have to deal with two kinds of schemas: the global schema and the local schema. The global schema defines all of the data contained in the distributed data base system as if the data were not distributed at all. The local schema at each site defines the data contained at that particular site.

The concept of heterogeneity introduces new complexity: if heterogeneity exists within a distributed data base system, then that system's design must provide for mapping from the structures of one DBMS to another. Such structures are specified in the DBMS's data definition language (DDL). In addition, the commands of one DBMS's data manipulation language (DML) would have to be translated to their equivalents in the DML of the other DBMS(s).

One approach to heterogeneity is to carry out conversion at the local data base sites. Conversion requires that all of the constituent data bases be regenerated using the same commercial DBMS at each location so that the same data model prevails throughout the system. Conversion reduces the design problem to one of a homogeneous DDBMS. Conversion may be practical to undertake if a majority of the local data bases already use the same data model. Alternatively, the simplification in design may justify the cost of conversion in a large distributed data base system effort.

Often, however, conversion is an option that cannot be undertaken. Conversion would require a large reeducation effort for strictly local database users, in addition to the education effort that will be necessary for users of the distributed system. Local software applications will have to be changed to the new data model and language. The enforcement of a central standard to the complete exclusion of already existing local conventions may be politically unacceptable. And finally, conversion is an inherent violation of the previously stated objective of maintaining local data autonomy.
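The homogeneous/heterogeneous distinction drawn in this section can be illustrated with a minimal Python sketch. The `Site` record, the `classify` function, and all site, hardware, and operating system values below are hypothetical names introduced only for illustration; the one idea taken from the text is that the test looks at the local DBMS alone and ignores hardware and operating system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    """One site of a distributed system (all field values are illustrative)."""
    name: str
    hardware: str
    operating_system: str
    dbms: str


def classify(sites):
    """Return 'homogeneous' if every site runs the same DBMS, else 'heterogeneous'.

    Hardware and operating system are deliberately ignored: the distinction
    is drawn at the level of the local DBMS only.
    """
    if not sites:
        raise ValueError("a distributed system needs at least one site")
    distinct_dbms = {site.dbms for site in sites}
    return "homogeneous" if len(distinct_dbms) == 1 else "heterogeneous"


# Same DBMS on different (hypothetical) hardware and operating systems.
ingres_sites = [
    Site("site-1", "hardware-A", "os-A", "INGRES"),
    Site("site-2", "hardware-B", "os-B", "INGRES"),
]

# Two different local DBMSs (hypothetical product names).
mixed_sites = [
    Site("site-1", "hardware-A", "os-A", "relational-dbms"),
    Site("site-2", "hardware-A", "os-A", "codasyl-dbms"),
]

print(classify(ingres_sites))
print(classify(mixed_sites))
```

Under this sketch, conversion amounts to rewriting every `dbms` field to a single value, which is why it reduces the design problem to the homogeneous case.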

3. GENERAL DDBMS ARCHITECTURES

3.1 Classification

The classification of generic types of DDBMS design requires a framework that addresses the issues of mapping and translation between different DBMSs. The main consideration is that of data integration. DDBMSs can be classified according to whether they are integrated or unintegrated.

When the schemas of the local data bases remain unintegrated, user transactions must be directed explicitly to only one of the underlying data bases. The unintegrated approach is known as the multidatabase approach. This approach requires that the global schema model represent each local data base separately. The multidatabase approach thereby presents users with a common global data model and a common global DML for addressing the underlying data bases. Such a framework frees the user from having to learn the features of multiple data models and DMLs. After separately inspecting each component data base schema, the user must then identify each local data base that he or she wishes to use. The heavy analytical effort of integrating each local data base schema into one global data base schema is thus avoided, at the expense of sacrificing location transparency.

Depending on the design of the individual multidatabase system, each global DML query must either:

1. Be directed by the user to one specific component data base, or
2. Be routed by a global directory to each of the local data bases to which the user has connected.

In either event, the user must have already identified each individual component data base that he or she wishes to use. Note that in the multidatabase approach, a

Figure 2. Multidatabase architecture. (Diagram labels: three relational global schemas; local schemas that are relational, CODASYL, and relational.)

global query is not broken down into subqueries; a global query either goes in its entirety to only one of the connected local data bases, or it goes in its entirety to all of the connected local data bases. Figure 2 presents a heterogeneous multidatabase DDBMS example. Although multidatabase systems would typically be heterogeneous, it is important to note that the multidatabase concept could be applied to produce a homogeneous DDBMS. A homogeneous multidatabase DDBMS would offer only one advantage over maintaining completely separate local DBMSs: a single query could simultaneously address each of several local data bases that the user had specified for connection. A homogeneous multidatabase DDBMS would thereby free the user from having to redundantly enter the same query.

The integration of local schemas at the global level results in the creation of one consolidated global conceptual schema. User queries directed against this schema can be decomposed so that subquery components of the original query will each be directed to just one of the underlying data bases. The subqueries will be mapped from the structures of the global data model (relations, record types) to the corresponding structures at the local data base level. The mappings are contained in a global internal schema. In similar fashion, the commands of the global DML will be translated to the local DML equivalent. Figure 3 presents a heterogeneous example of the integrated schema approach. In this example, the data stored in the relational and CODASYL local data bases have been expressed in relational equivalents in the (one) integrated global schema.

Figure 4 depicts various architectures that have been proposed for implementing the two generic DDBMS approaches of integrated schema and multidatabase [19, 21]. Integrated schema may be decomposed into two subcategories: integration that produces both a conceptual and an internal schema, and integration that results in only an internal schema. The creation of both global conceptual and internal schemas is embraced by both the seven-level architecture and the distributed data base architecture. Alternatively, the global view architecture implements only an internal schema. Different multidatabase approaches have not yet appeared: there is consequently one multidatabase architecture. All of the architectures presented could be used in implementing either a homogeneous or a heterogeneous DDBMS.

These architectures are described in detail in the following sections. The seven-level architecture will be presented in Section 3.2. The distributed data base architecture will be shown in Section 3.3. The global

Figure 3. Integrated schema architecture.


view architecture is discussed in Section 3.4, whereas Section 3.5 presents the multidatabase architecture. Section 3.6 has comments on the architectures.

Figure 4. Classification of architectures.

3.2 Seven-Level Architecture

One of the first general architectures for distributed data bases appeared in the work of Cardenas and Pirahesh [8]. The seven-level architecture was formulated in order to specifically address the heterogeneity challenge, but it is also applicable to a homogeneous environment. Figure 5 depicts the seven-level architecture and the major mappings between its levels. The levels are described briefly as follows:

Virtual model (VM): User views of the global integrated schema. This level corresponds to the subschema of the CODASYL schema, and to the external schema of the ANSI/SPARC architecture.

Unified global conceptual model (UGCM): The integrated global schema expressed in terms of a data model. Cardenas and Pirahesh suggested use of the entity-relationship (ER) model [10] for the global data model.

Unified global internal model (UGIM): This level serves as a directory or catalogue to contain access paths from the UGCM logical elements down to their local logical equivalents in the unified local conceptual model.

Unified local conceptual model (ULCM): Each local data base is represented in the global data model at this level. The UGCM is hence the integration of the ULCMs into one global schema.

Unified local internal model (ULIM): A directory for mapping between the ULCM and the local logical model level.

Local logical model (LLM): The local data base schema expressed in the DDL of the local data model.

Local physical data base (LPDB): The actual file structures and indexes, etc., that comprise the local data base.

Figure 5. Seven-level architecture.

The major mappings between the levels are presented as being:

M6: The mapping that equates the data structures of user logical views at the VM level to their ER equivalents at the UGCM level.

M5: The mapping from the UGCM level to the UGIM level. This mapping is necessary in order to retrieve the UGIM's M4 mapping data.

M4: The mapping that equates a global schema logical element to its local schema logical equivalent(s). In effect, this mapping identifies the local data bases that contain data on the object of interest.

M3: A mapping from the UGIM level to the ULIM level. This mapping is done in order to retrieve the M2 mapping data.

M2: The mapping that equates the local logical equivalent (at the ULCM level) to its local data model form (at the LLM level). Mapping M2 is thus analogous to M6.

M1: The mapping from the LLM level to the ULIM level. This mapping is necessary in order to retrieve M2 mapping data from the ULIM level.

The seven-level architecture in effect superimposes both local and global schemas expressed in a suitable model above the local data bases and their data models (hierarchical, network, relational). The seven-level approach requires that a schema for each local data base be created in the data model of the global architecture. These new ULCM schemas are not merely temporary design steps in the creation of the UGCM. Each ULCM

schema is instead a permanent functional part of the seven-level architecture. Cardenas and Pirahesh proposed that the data-independent access method (DIAM) [32] be used as the mechanism to map between the levels of the architecture.

3.3 Distributed Data Base Architecture

Figure 6 presents the distributed data base architecture [7]. The global conceptual schema is a composite scheme for representing all of the data existing throughout the distributed system. The global conceptual schema may or may not represent the integration of the schemas of the underlying data bases. The distributed data base system as a whole may be either homogeneous or heterogeneous. Individual user views could also be provided. The global internal schema consists of three components:

1. The fragmentation schema: The global schema is split into several portions by means of vertical and horizontal fragmentation. Each fragment is a logical portion of the global schema. The mapping between the global schema and its fragments is defined in the fragmentation schema.
2. The allocation schema: Each fragment defined above may be physically located at one or more sites in the distributed data base system. The allocation schema defines the sites where each fragment is allocated. The physical location of data is used by the communication network in routing queries to remote data bases.
3. The mapping schema: The fragments of the global schema that are allocated to a site are mapped to the data model used by the DBMS at that site. In a heterogeneous system, different sites will need different types of mapping schemas if they use different DBMSs.

Figure 6. Distributed data base architecture [7]. (Diagram labels: user views; global conceptual schema; global internal schema, comprising fragmentation, allocation, and mapping schemas.)

The local conceptual schema is the representation of each local data base in the DDL of the local data model. Of course, the local data bases will also each have file structures and indexes that are the equivalent of the LPDB expressed in the seven-level architecture.
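The way the three components of the global internal schema cooperate can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the relation `EMPLOYEE`, the fragment names, the site names, and the rule of picking the first allocated site for a replicated fragment are all hypothetical and are not taken from the architecture in [7].

```python
# Hypothetical global internal schema for one global relation, EMPLOYEE.

# Fragmentation schema: global relation -> its logical fragments.
FRAGMENTATION = {
    "EMPLOYEE": ["EMP_TUCSON", "EMP_PHOENIX"],  # horizontal fragments
}

# Allocation schema: fragment -> sites where the fragment is stored.
ALLOCATION = {
    "EMP_TUCSON": ["site1"],
    "EMP_PHOENIX": ["site2", "site3"],  # stored at two sites
}

# Mapping schema: site -> data model of the DBMS at that site.
MAPPING = {
    "site1": "relational",
    "site2": "CODASYL",
    "site3": "relational",
}


def plan_subqueries(global_relation):
    """Resolve a global relation to (fragment, site, local data model) triples.

    For a fragment stored at several sites, the first listed site is used;
    this replica selection rule is an assumption made for the sketch.
    """
    plan = []
    for fragment in FRAGMENTATION[global_relation]:
        site = ALLOCATION[fragment][0]
        plan.append((fragment, site, MAPPING[site]))
    return plan


for fragment, site, model in plan_subqueries("EMPLOYEE"):
    print(f"{fragment}: send subquery to {site}, translate to {model} DML")
```

Each triple corresponds to one subquery: the fragmentation schema says which logical portions are needed, the allocation schema says where to route each one, and the mapping schema says into which local data model it must be translated.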
3.4 Global View Architecture

The seven-level and distributed data base architectures both require the creation of a single, integrated global conceptual schema. This schema is then stored in a dictionary so that users may examine it in order to determine the system data contents. In addition, views could be created which would present selected users with only certain subsets of the total system data contents. The global view architecture differs in that no

Figure 7. Global view architecture. (Diagram labels: user views; global internal schema; local conceptual schemas.)

global conceptual schema is provided for user inspection; instead, each user is provided with his or her own view of portions of the total system data. The global internal schema provides the mapping information required to link each view with the underlying data bases. Of course, an integrated global conceptual schema would effectively be created in the process of total system design; it would be necessary in order to determine what individual user views could be established. But the conceptual schema would remain a logical representation of the total system data that would be used only by the data base administration department: users would never access it. Figure 7 presents the global view architecture.

3.5 Multidatabase Architecture

The multidatabase approach shares several features in common with the distributed data base architecture. As indicated in Figure 8, each local data base is represented by a local conceptual schema expressed in the DDL of the local data model. The internal schema contains the mapping from the local conceptual schema to its distributed system counterpart. This counterpart, the multidatabase conceptual schema, is expressed in the DDL of the global data model. One such multidatabase conceptual schema will exist for each underlying data base. The user must therefore browse through a succession of wholly separate component data base schemas, with no location transparency.

Where the local conceptual schema and its multidatabase conceptual schema share the same data model, the internal schema only maps the one-one correspondence between each local structure and its parallel multidatabase counterpart. When the local and multidatabase conceptual schemas use different data models, the internal schema must map between different structural types.

Note that an integrated global data base could be a component data base of a multidatabase. That is, one of the local data base schemas presented in a multidatabase system could itself be a global schema for an integrated DDBMS.
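The routing behavior of the multidatabase approach, in which a global query is never decomposed, can be sketched as follows. The class `MultidatabaseSession`, the stub data bases, and the query text are hypothetical names introduced for illustration; no surveyed system is claimed to expose this interface.

```python
class MultidatabaseSession:
    """A user's session over separately represented component data bases."""

    def __init__(self, local_databases):
        # name -> callable that executes a query text at that data base
        self._local = dict(local_databases)
        self._connected = []

    def connect(self, name):
        """The user must explicitly name each component data base to use."""
        if name not in self._local:
            raise KeyError(f"unknown component data base: {name}")
        if name not in self._connected:
            self._connected.append(name)

    def query(self, text, target=None):
        """Send the query, unchanged, to one connected data base or to all.

        The query text is never split into subqueries.
        """
        if target is None:
            targets = list(self._connected)
        else:
            if target not in self._connected:
                raise ValueError(f"not connected to {target}")
            targets = [target]
        return {name: self._local[name](text) for name in targets}


def make_stub(name):
    """Build a stand-in for a local DBMS that just echoes what it received."""
    def run(query_text):
        return f"{name} received: {query_text}"
    return run


session = MultidatabaseSession(
    {name: make_stub(name) for name in ("personnel", "payroll", "inventory")}
)
session.connect("personnel")
session.connect("payroll")

broadcast = session.query("LIST EMPLOYEES")
single = session.query("LIST EMPLOYEES", target="payroll")
print(broadcast)
print(single)
```

The `inventory` data base receives nothing because the user never connected to it, which reflects the loss of location transparency: the burden of choosing data bases stays with the user.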

3.6 Comparison of the Architectures

Figure 9 presents the major components of each of the general architectures. Their similarities include the fact that each architecture posits the use of individual user views. In addition, each architecture requires creation of an internal schema. The internal schema contains the information required to map between data representations in different data models. The major differences include the following:

1. Only the seven-level and distributed data base architectures include integrated global conceptual schemas as functional components of the distributed system.

Figure 8. Multidatabase architecture. (Diagram labels: multidatabase conceptual schema; internal schemas; local conceptual schemas; local data base.)

2. Only the global view architecture absolutely requires that individual views be created.
3. Only the seven-level architecture maintains both global and local conceptual schemas in the global data model. The distributed data base retains only the global conceptual schema, whereas multidatabase designs include only local conceptual schemas in the global data model. The global view approach retains neither local nor global conceptual schemas in the global data model.

Figure 9. Comparison of architectures.

The seven-level architecture's usage of intermediate local schemas expressed in the global data model appears to increase system mapping requirements by a factor of two. In fact, however, the mapping requirements of the architectures are the same. This is so in that the M4 mapping of the seven-level architecture's UGIM level is equivalent to the fragmentation and allocation schemas of the other architectures' global internal schemas. In similar fashion, the M2 mapping of the ULIM level is equivalent to the mapping schema component of the internal schemas. The mapping requirements of the architectures are therefore equivalent.

Note that user views set up in the seven-level, distributed data base, and multidatabase architectures could also potentially involve heterogeneity. For instance, a hierarchical view might be established for a user, even though the global data model is relational. The use of heterogeneous views would provide the appearance of multiple global data models being offered in the distributed system. Such an approach would also considerably increase system complexity in that an additional mapping would be required between the data model used in the user view, and the global data model used in the global conceptual schema.

The difficult task of mapping and translation between heterogeneous data models is isolated in all of the architectures at the top-most level (user views) and the bottom-most level (linkage to local data bases). The seven-level concept requires that, in addition to the global conceptual schema, a schema for each local data base be created and maintained in the DDL of the selected global data model. This means that the seven-level architecture requires greater overhead than the other architectures. The seven-level architecture gains an advantage in maintaining these additional schemas. When routine modifications (updates, deletions, insertions, or data reorganization) are made to existing data types (CODASYL record types, relation attributes, etc.), the modifications impact the seven-level architecture only at the local data base level and at the ULIM (mapping) level. For the distributed data base or global view architectures, however, such modifications would require changes at the global internal schema level itself.

4. SURVEY OF CURRENT DDBMS ARCHITECTURE

The available literature on current distributed data base systems was surveyed in order to identify actual implementations of the various general DDBMS architectures. Table 1 presents a summary of the findings for sixteen different systems. Note that the systems surveyed represent a broad range of development: some are being prototyped at research centers, whereas others are in the industrial testbed stage or even commercially available.

Table 1. Survey of Distributed Data Base Management Systems

COSYS

CSIN

DDTS -~-

Distributed INGRES

DPLS

MESSIDOR

1. Classification: Integrated schema or multidatabase 2. Architecture A. Integrated schema architecture B. Homogeneous or heterogeneous C. Global schema representation D. Global data model E. Local data model

Integrated schema Seven level Heterogeneous Data semantics model Relational No discussion

Multidatabase

Integrated schema
Seven level

Integrated schema Global view Homogeneous

Integrated schema Distributed database Heterogeneous DPLS data model

Multidatabase

Heterogeneous Relational

Heterogeneous Entity-category-relationship Relational IDS-II (Network)

Heterogeneous Relational

Relational No discussion

Relational INGRES

DPSL data model No discussion Discussion

Relational NTIS, INSPEC, TELEDGC, etc. Yes

F. Global user views

Yes

No discussion

No discussion

No discussion

No

MULTIBASE

MUQUAPOL

POLYPHEME

POREL

PRECI*

PROTEUS

SDD-1

1. Classification: Integrated schema or multidatabase 2. Architecture A. Integrated schema architecture B. Homogeneous or heterogeneous C. Global schema representation D. Global data model E. Local data model Integrated schema Multidatabase Integrated schema Integrated schema Integrated schema

schema

Integrated schema

Seven level Heterogeneous Heterogeneous

Seven level Heterogeneous

Distributed data base Homogeneous

Seven level Heterogeneous

Distributed data base Heterogeneous

Distributed data base Homogeneous

Functional data model Functional data model No discussion

Relational

Relational

Relational

Relational

ER variant

Relational

Relational POLYPHEME DDBMS

Relational 1. URANUS (Relational) 2. SOCRATE (Network) No discussion

Relational Relational

Relational No discussion

Relational No discussion

Relational

F. Global user views

Yes

No discussion

Not certain

No discussion

No discussion

No discussion

SIRIUS-DELTA System R* XNDM

1. Classification: Integrated schema or multidatabase 2. Architecture A. Integrated schema architecture B. Homogeneous or heterogeneous C. Global schema representation D. Global data model E. Local data model F. Global user views

Integrated schema Seven level Heterogeneous Relational Relational 1. MRDS (Relational) 2. PHLOX (Network) Yes

Integrated schema Global view Homogeneous Relational System R (Relational)

Integrated schema Distributed database Heterogeneous Relational Relational 1. MRDS (Relational) 2. IDS-II (Network) No

4.1 Integrated Schema

Thirteen of the distributed systems were categorized as being integrated schema designs. The integrated schema designs are further categorized in the following paragraphs.

Seven-level architecture. Six of the integrated schema designs were examples of the seven-level architecture:

COSYS: The cooperation system, or COSYS, was developed at the University of Grenoble [3]. COSYS uses the data semantics model [1], an example of the binary relational data model, for its global schema description. There are three main components of COSYS:

1. A local views processor that generates a data semantics model schema for each local data base. Each such schema is an example of the ULCM level.
2. A global views processor that manages everything concerning the global view (global conceptual schema). The global views processor thus represents the UGCM and ULIM levels.
3. The global programs processor that conducts the processes of global query optimization and decomposition.

COSYS is a heterogeneous DDBMS.

DDTS: The Honeywell distributed data base testbed system (DDTS) is a testbed effort whose goals are to emphasize architectural simplicity and efficiency [17]. The architecture of DDTS consists of the following:

1. A conceptual schema, which is an integrated global conceptual schema, that semantically describes the total data content of the DDBMS. The entity-category-relationship (ECR) model [39] is used as the global data model for semantic purposes in DDTS.
2. A global representation schema that serves to syntactically describe total system data content. The relational model is used for this purpose. The conceptual and global representation schemas thus correspond to the UGCM level of the seven-level architecture.
3. External schemas or user views of the global conceptual schema. These views correspond to the VM level of the seven-level architecture.
4. One local representation schema is created for each local data base. The local representation schema is a relational model syntactical description of each local data base. The local representation schemas hence correspond to the ULCM level of the seven-level architecture.
5. A local internal schema exists for each local data base. The local internal schemas use the data model of the local DBMS. The local internal schemas thus correspond to the LLM of the seven-level architecture.

DDTS has been implemented on three Honeywell Level 6 minicomputers. Communications are via a bus local area network. User requests are submitted in GORDAS, a DML which has been defined for use with the ECR model. Each GORDAS command is checked against the conceptual schema and translated into the relational terms used by the global and local representation schemas. A stated objective of DDTS is to experiment with global concurrency control, utilizing both global locking and time stamping. DDTS is a heterogeneous DDBMS.

MULTIBASE: The Computer Corporation of America has developed a prototype DDBMS, called MULTIBASE, under grants from the Department of Defense [25]. The major architectural components of MULTIBASE are as follows:

1. A global schema that provides an integrated global conceptual schema for the entire data content of the distributed system. The global schema uses DAPLEX, an implementation of the functional data model [33], as its global data model. The global schema corresponds to the UGCM level.
2. A DAPLEX local schema is established for each of the local data bases in the distributed system. The DAPLEX local schemas of MULTIBASE equate to the ULCM level.
3. A local host schema exists for each local data base. Each local host schema is defined in the data model used by the local DBMS. The local host schemas thus correspond to the LLM level.
4. View derivations in MULTIBASE represent the stored rules that enable MULTIBASE to transform a DAPLEX query directed against the global schema into subsets directed to appropriate DAPLEX local schemas. The view derivations equate to the UGIM level.

The global data manager handles all of the activities of MULTIBASE at the global level, to include the optimization and decomposition of queries.
The global data manager also establishes and maintains an auxiliary data base for handling data incompatibilities between the DAPLEX local schema representation of data and the global schema representation of the same data. A local data base interface component is created for each local data base in order to interface between the DAPLEX local schema and its corresponding local host schema. The local data base interface is responsible for translating the DAPLEX single-site query that arrives at the DAPLEX local schema into its local host schema DML equivalent. The local data base interface also includes a local optimizer that seeks to formulate the local query that can be most rapidly processed by the local DBMS. MULTIBASE provides read-only access to local data bases; no update capability is provided. MULTIBASE is a heterogeneous DDBMS.

Polypheme: The University of Grenoble and the CII-Honeywell-Bull Research Center have produced the prototype for a distributed data base system, POLYPHEME [2]. The architecture of POLYPHEME includes the following:

1. A global machine that serves as both an integrated global conceptual schema and a mapping mechanism. The global machine thus corresponds to both the UGCM and UGIM levels. The relational model is used for the global data model.
2. One local machine is established for each local data base. The local machines provide the functions of the ULCM and ULIM levels. The relational model is used for each local machine.
3. Local data bases whose schemas represent the LLM level.

POLYPHEME employs URANUS as its relational system. Local data bases are either implementations of URANUS, or of the SOCRATE network DBMS. The global execution monitor of the global machine handles activities between the global and local machines. Examples of global execution monitor functions include decomposition and optimization of global queries, and reintegration of results received from the local data bases. A local execution monitor exists for each local machine. The local execution monitor receives subqueries, translates them to the local DML format, and receives responses for transmission to the global machine. POLYPHEME leaves concurrency control to local DBMS facilities. POLYPHEME is a heterogeneous DDBMS.

PRECI*: The University of Aberdeen has undertaken a distributed data base research prototype called the prototype of a relational canonical interface, or PRECI* [15]. The architecture of PRECI* consists of the following:

1. A global data base schema that serves as an integrated conceptual schema for the distributed system. The global data base schema serves as the UGCM level. The relational model is used as the global data model.
2. Global external schemas that provide individual user views, and hence conform to the VM level.
3. A participatory schema is established for each local data base. The participatory schema describes the data of the local data base in terms of the relational model. The participatory schema thus equates to the ULCM level.
4. Local nodal data bases, each one using the data model of the local DBMS. The schema of the nodal data base thus serves as the LLM level.

PRECI* assumes that a subsidiary data base may need to be created for each node in order to contain integration data necessary to fully convert from the form of the local nodal data base to the relational form of the participatory schema. A component called the global query preprocessor performs the tasks of optimization and decomposition on global queries, and prepares an overall execution plan. The global subquery preprocessor then transforms subqueries into (local) nodal queries directed to either a local nodal data base or the subsidiary data base associated with a nodal data base. The global subquery preprocessor must also translate subqueries to the DML of the nodal data base. Actual execution of global queries is performed by the global query executor, and of subqueries by the global subquery executor. PRECI* is a heterogeneous DDBMS.

Sirius-Delta: The French government's Institut National de Recherche en Informatique et Automatique (INRIA) has sponsored the SIRIUS-DELTA system. SIRIUS-DELTA is a preindustrial prototype intended to culminate in a general purpose commercial DDBMS [26]. The architecture of SIRIUS-DELTA consists of the following:

1. A global conceptual schema that provides an integrated global conceptual schema. The global conceptual schema corresponds to the UGCM level. The relational data model is used as the global data model.
2. Global external views of the global external schema may be established for individual users. The global external views thus equate to the VM level.
3. A global internal schema maps from the global conceptual schema to the various local external views. The global internal schema corresponds to the UGIM level.
4. A local external view is created for each local data base. The local external view is a relational schema representation of the local data base. As such, the local external view is an example of the ULCM level.
5. A local conceptual schema exists for each local data base and is expressed in the data model used locally. The local conceptual schema corresponds to the LLM level.

SILOE is the SIRIUS-DELTA component that handles data distribution and provides for the execution of global queries. Like all of the components of SIRIUS-DELTA, SILOE exists at both the global and local levels. Global SILOE performs decomposition and optimization of queries. Local SILOE performs translation between the local external view and the local conceptual schema. The global distributed execution component of SIRIUS-DELTA is called global SER. As such, SER synchronizes execution of the plan created by SILOE. Each local data base has a local version of SER; the local SERs control execution of the subqueries to be run locally. The SCORE component of SIRIUS-DELTA provides global concurrency control. Two-phase locking is used as the concurrency control mechanism. The SIRIUS-DELTA prototype has been implemented on microcomputers using the PHLOX network DBMS at the local data base level. SIRIUS-DELTA is a heterogeneous DDBMS.

Distributed data base architecture. Five of the integrated schema designs were of the distributed data base architecture.

DPLS: The Data Base Program Base and Language Base Support project, or DPLS, is a joint effort of the Nibon Systemix Corporation and the University of Tsukuba [37]. The DPLS data dictionary data base contains the data associated with both the global conceptual and global internal schemas. The global data model is the DPLS data model [37], a variant of the ER model. The front analyzer component performs the tasks of global query optimization and decomposition. The data translator component accesses the mapping data needed to link subqueries to local data bases. DPLS permits only one data base to be updated at a time by a single transaction. DPLS creates system redundancy by storing copies of the data dictionary data base at different points throughout the system.

POREL: The POREL system has been developed at the University of Stuttgart as a research effort in distributed data base technology [30].
POREL is a homogeneous DDBMS in which each local data base and the global data base all utilize the same relational data model. The POREL design seeks to emphasize distributed, rather than centralized, control of transaction execution, authorization, recovery, and data storage. A global query is first translated from the relational DML to relational algebraic language. The network oriented analysis component then accesses catalog data equivalent to that of the global conceptual and global internal schemas in order to validate and decompose the global query. The catalogs are managed by the catalog manager. Catalog information on data location and access is only kept at the local data bases where the data is actually stored. Optimization takes place using an algorithm that seeks to minimize communications traffic. The transaction manager component supervises execution of the transactions, routing them to the local data bases, or local relational base machines. Each local relational base machine then executes the subquery which reached it, and returns query responses to the transaction manager. POREL utilizes the X.25 packet switching protocol for network communications. Locking is used at the global level as the concurrency control mechanism.

PROTEUS: The Science and Engineering Research Council of the United Kingdom has funded a multiuniversity project, PROTEUS, to conduct research in distributed data base technology [35]. PROTEUS employs a global conceptual schema called the abstracted conceptual schema. The global data model of PROTEUS appears to be an ER variant [35]. The global DML uses relational algebra for its standard operations. The choice of global model and language was made in order to facilitate the integration of either relational or network local data bases. PROTEUS maintains a single central node with a directory that contains system mapping data. The central directory is thus equivalent to the global internal schema. The other components of the central node are as follows:

1. A query decomposer for breaking global queries into appropriate local subqueries.
2. A composer for reintegrating subquery responses.
3. A scheduler that monitors, receives, and dispatches all of the subquery components of a transaction and the subquery responses.

SDD-1: The Computer Corporation of America has developed the SDD-1 distributed system [31]. The major architectural components of SDD-1 are as follows:
1. Data modules that manage all of the data of the system. Each data module is in fact a local data base presided over by its DBMS.
2. Transaction modules that plan and control the distributed execution of transactions. The transaction module performs the global functions of decomposition, plan optimization and generation, and subsequent execution of the plan.

The transaction module(s) contain the information associated with the global conceptual schema and global internal schema of the distributed data base architecture. The same relational model is used at both the data module and transaction module level. SDD-1 is therefore a homogeneous DDBMS. Strong global concurrency control is enforced in SDD-1 by means of timestamp-based protocols. The protocols are applied selectively to incoming transactions. This is made possible by conducting extensive analysis of anticipated transactions during the system's initial design phase. The identified transactions are grouped into classes based on the perceived requirement for global concurrency control. For instance, two update transactions that involve conflicting READS and WRITES may require a protocol that involves considerable synchronization. A read-only transaction, however, may only require a protocol involving less synchronization. Multiple copies of the global directories (equivalent to the global conceptual and global internal schemas) are stored at transaction modules throughout the system. A reliable, packet-switched communications network is used by SDD-1. The combination of distributed concurrency control and reliable communications is intended to make SDD-1 an architecture with high redundancy and reliability.

XNDM: The National Bureau of Standards in conjunction with the U.S. Air Force sponsored the XNDM distributed data base research project [22]. The network virtual schema serves as the XNDM equivalent of the global conceptual schema [22]. The global data model is relational. The user protocol interpreter determines which local data bases are capable of fulfilling the data requirements of the transaction; the user protocol interpreter thus contains data that corresponds to the global internal schema. The user protocol interpreter also performs the function of query optimization. Each local data base has a server protocol interpreter that translates subqueries to the language of the local DBMS and also receives responses from the local DBMS. Concurrency control in XNDM is left to the local DBMSs. XNDM has been implemented on a PDP 11/45 processor.
Local data bases have been the Honeywell integrated data store, a network system, and the Multics relational system.

Global view architecture. Two of the integrated schema systems are examples of the global view architecture.

Distributed INGRES: Distributed INGRES is a development of the University of California at Berkeley and Relational Technology, Inc. [36]. Distributed INGRES is designed so that each component data base must be an implementation of the INGRES Relational DBMS. Distributed INGRES is therefore a homogeneous system. A distributed INGRES data base is created by having an authorized user submit the appropriate DDL commands at one of the distributed sites. The command must name every local data base site that will participate in the DDBMS. As the creation command executes, a collection of system catalogs is created. New relations can then be created and partitioned, if necessary, among sites according to appropriate selection information. When a query is submitted to a distributed INGRES DDBMS, the inception site is considered to be the master INGRES process for that query. The master INGRES process parses the command, checks catalog information on data existence and location, and creates an optimized execution plan. Subcomponents of the plan are then forwarded to each local INGRES data base for execution. The local data base concurrency control mechanisms (locking) are used in distributed INGRES. A two-phase commit protocol is used to synchronize updates of data.
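The two-phase commit protocol mentioned above can be illustrated with a minimal sketch. This is not the distributed INGRES implementation; the `Participant` class, its `can_commit` flag, and the `two_phase_commit` function are hypothetical names introduced only to show the voting and decision phases of the generic protocol, with no logging, timeouts, or failure recovery.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    """Hypothetical stand-in for one local site taking part in an update."""
    name: str
    can_commit: bool = True   # assumed flag: whether the local update can be made durable
    state: str = "idle"

    def prepare(self) -> bool:
        # Phase 1: the site votes yes only if it is able to commit locally.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self) -> None:
        self.state = "committed"

    def abort(self) -> None:
        self.state = "aborted"


def two_phase_commit(participants):
    """Coordinator: commit at every site only if every site votes yes."""
    # Phase 1 (voting): collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    # Phase 2 (decision): broadcast one uniform outcome to all sites.
    if all(votes):
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:
        p.abort()
    return "aborted"
```

A single negative vote is enough to abort the update at every site, which is what keeps the copies of the data in step.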

System R*: System R* is an experimental distributed data base system developed by the IBM corporation [40]. Each local data base in System R* must be an instance of a System R relational data base. System R* is therefore a homogeneous system. The architecture of System R* emphasizes local site autonomy. This objective is achieved by avoiding global and centralized data and control structures. Instead of a global schema, each local data base in System R* maintains a catalog of its own data, as well as the data stored at other data bases that local users of the distributed system are authorized to access. This catalog is in addition to the catalog that each local data base maintains for strictly local purposes. It is possible that data will be fragmented so that data initially created at one site will be stored elsewhere. This is provided for in System R* by having the catalog at the birth site of a data object keep information on where the object is currently stored. In this manner, at most two accesses must be made in System R*: one from a user-transaction node to the birth-site node of a data object and the second from the birth-site node to the current-storage node. Queries in System R* are first parsed and then checked against the local-site catalog. If different sites are involved, then catalog information must be fetched from those sites. Each local data base also contains all of the controls for accessing locally stored data. Following the completion of user authorization, an execution plan will be generated at the original query site. Subqueries are then sent to destination data bases. The access path selector component at each local System R data base will generate local execution strategies and the relevant local subquery code. System R* uses the local locking concurrency-control mechanisms of each System R data base. The two-phase commit protocol is used to ensure uniform transaction commitment/abortion.
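The birth-site catalog scheme described above can be sketched in a few lines. The sketch is illustrative only: the `Site` class and the `create`, `migrate`, and `fetch` functions are hypothetical names, and the caller is assumed to already know the birth site of the object it asks for. It shows why a lookup never needs more than two accesses, however often the object has moved.

```python
class Site:
    """Hypothetical local data base site."""

    def __init__(self, name):
        self.name = name
        self.storage = {}        # objects physically stored at this site
        self.birth_catalog = {}  # object name -> current storage site, for objects born here


def create(obj, value, birth_site):
    """Create an object; it is initially stored at its birth site."""
    birth_site.storage[obj] = value
    birth_site.birth_catalog[obj] = birth_site


def migrate(obj, birth_site, new_site):
    """Move an object and record its new location in the birth-site catalog."""
    current = birth_site.birth_catalog[obj]
    new_site.storage[obj] = current.storage.pop(obj)
    birth_site.birth_catalog[obj] = new_site


def fetch(obj, birth_site):
    """Return (value, number of site accesses needed to locate the object)."""
    accesses = 1  # first access: user-transaction node to birth-site node
    current = birth_site.birth_catalog[obj]
    if current is not birth_site:
        accesses += 1  # second access: birth-site node to current-storage node
    return current.storage[obj], accesses
```

Because every migration updates the birth-site catalog rather than leaving a chain of forwarding pointers, the access count stays at two even after repeated moves.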



4.2 Multidatabase

Three of the systems surveyed were multidatabase architecture designs.

CSIN: The Chemical Substances Information Network, or CSIN, is a product of the Computer Corporation of America [9]. CSIN provides access to local data bases which are repositories of chemical information. The global dictionary, or schema, is called the information resources directory. The global data model is relational. CSIN provides read-only capability to its component data bases: no updates are allowed through the distributed system.

MESSIDOR: MESSIDOR is a prototype multidatabase system that has been sponsored by the French government agency INRIA [29]. MESSIDOR presents the user with a global schema that contains the data contents of the various local data bases that comprise the system. The global data model is relational. Each local data base's data contents are presented separately in the MESSIDOR global schema. There is no location transparency, and the global schema is not integrated. In MESSIDOR, each local data base is a bibliographic data base that contains data pertaining to documents. After the user has selected the pertinent local data bases, MESSIDOR will log in to each of the selected component data bases. The user may then initiate transactions in the DML of MESSIDOR. Each global transaction will automatically be translated to the DML of each of the local data bases, and the translated query will be routed to the correct connected local data base. In this manner the user, having once identified the destination data bases, need not further identify particular target data bases for each global transaction. Instead the entire global transaction, once translated, will be routed in its entirety to each of the connected data bases. Results from each data base are received and reformatted into MESSIDOR's DML. The final results are then displayed. The results also clearly indicate the extent to which each connected data base contributed to the response.

MUQUAPOL: Another INRIA project is the MUQUAPOL system [26]. MUQUAPOL is a multidatabase system whose local data bases are all implementations of the POLYPHEME DDBMS. The MUQUAPOL global schema therefore contains the global schemas of one or more POLYPHEME distributed systems. The user then selects the POLYPHEME global machines that he or she wishes to connect to. The global data model of MUQUAPOL is relational. The DML is an extension of the QUEL relational language.

4.3 Comments on Current Designs

In reviewing the architecture of the sixteen systems, several observations can be made:
1. At present, the integrated schema architecture is predominant over the multidatabase architecture. Much of the current multidatabase work is being done in Europe.
2. No one of the three subcategories of seven-level, distributed data base, or global view has emerged as the preferred design when building an integrated schema DDBMS.
3. The difficulties of heterogeneity are already a focus for research effort in that twelve of the sixteen designs are heterogeneous.
4. Where the DDBMS is to be homogeneous, commercial designs have preferred the global view architecture.
5. The relational model is used as the global data model in thirteen of the designs.
6. The ten integrated schema designs which were classified as being either seven-level or distributed data base architectures varied in their choice of a data model to represent the global schema. Five of the designs used the relational model. One design used the binary relational model. Three of the designs used ER variant models. One design used the functional data model.

These observations will be expanded upon in the next section.

5. DIRECTIONS FOR CONTINUED EFFORT

The integrated schema concept is the dominant architectural approach to distributed data base design. Given the pervasive interest in data base systems, the current level of data base research is not surprising. What is surprising is the relative scarcity of implemented systems. The sheer magnitude and complexity of the analysis effort that accompanies any large-scale data base project is one factor. When the effort also involves distribution and heterogeneity, the designer's task can easily be perceived as being overwhelming. Nor is the literature on the subject noteworthy for its clarity. Indeed, it is easily possible to arrive at conflicting interpretations of the architecture of a given DDBMS. One of the more prominent distributed efforts, SIRIUS-DELTA, is a case in point. Because SIRIUS-DELTA provides a single, integrated schema, it has often been described in terms which this paper has reserved for the distributed data base architecture. Other commentators, noting that each local data base schema has also been expressed in the global data model, have interpreted SIRIUS-DELTA to

be more consistent with the multidatabase approach. Similarly, the architecture of a homogeneous multidatabase is similar to the global view architecture. The advancement of a standard for evaluating and classifying DDBMSs would obviously facilitate greater understanding. Similarly, the analysis and design tasks required to produce an integrated schema distributed data base system require elaboration. The major tasks are as follows:

• Local Schema Conversion
• Integration into a Global Schema
• Translation of DML Commands

The three steps are briefly discussed below.

5.1 Local Schema Conversion

As a necessary precursor to the analytical task of creating an integrated global schema, each local schema needs to be represented in the terms of the global data model. Considerable effort has already occurred in the attempt to produce algorithmic approaches to schema conversion. Zaniolo [42] has produced algorithms for converting from the network data model to the relational data model. Kuck and Sagiv [24] have undertaken the converse task of providing procedures which convert a relational schema to a network schema. Wong and Katz [41] have proposed a set of algorithms to provide for both conversion of relational to network and vice versa. Johnson et al. [20] have similarly proposed a scheme for doing network-relational conversion in either direction. Klug [23] has worked on the case of converting from either hierarchical or network to relational form. Smith et al. [34] have augmented the MULTIBASE project by developing algorithms for converting from the network and relational models to the functional data model (DAPLEX) used by MULTIBASE. Motro has developed algorithms for converting schemas from his own data model, the abstract data model, to (and from) the network and relational data models [28]. Finally, Dumpala and Arora [16] have produced a methodology for converting from network and relational models to the ER model.

A key factor in considering conversion is what data model to use at the global level. Although the relational model is the most popular, a substantial school of thought believes that a neutral model is preferable for use in describing the global schema [4, 25, 41]. This school of thought is reflected in the DDBMSs of Table 1 which employ the relational data model as their global data model, but use another model for describing the global schema. Examples of this approach are COSYS, DDTS, and PROTEUS. Such a model is posited as having some features common to such commercial models as the network and relational, and therefore being more readily converted to/from the various local data models. The ER model (and its variants) and the functional data model have been proposed for the neutral model. A comparative analysis of implementing schema conversion via the various conversion algorithms would be a useful exercise in assessing both the relative merits of the algorithms and also in evaluating the merits of the neutral model approach.

5.2 Integration into a Global Schema

Conversion of each local schema to its representation in the global data model makes possible the next task: integration of the individual schemas into a single global schema. This major analytical task involves the concepts of aggregation, or the treatment of objects as subsets of other objects, and generalization, or the grouping of similar objects into one generic object. This area has received rather less research than schema conversion. Motro and Buneman [28] have developed an algorithmic approach to schema integration. Dayal and Hwang [12] have focused on data compatibility in their work done for the MULTIBASE system. Mannino [27] has proposed a complete design methodology that includes algorithms for designing the global schema and determining the equivalence of items that have been generalized. The successful testing of these algorithms would be a significant advance in creating confidence in whether commercial organizations could successfully undertake a large-scale integration effort.
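As a rough illustration of generalization, the sketch below groups similar local objects into one generic object. It is not one of the cited algorithms; the `generalize` function and the example relation names are hypothetical, and it assumes the hard part has already been done: attribute names have been reconciled so that equal names denote equivalent items.

```python
def generalize(name, local_schemas):
    """Group similar local relations into one generic object.

    local_schemas maps a relation name to its list of attribute names.
    Assumption: identically named attributes are already known to be
    equivalent across the local schemas.
    """
    attr_sets = [set(attrs) for attrs in local_schemas.values()]
    # Attributes shared by every local relation describe the generic object.
    common = set.intersection(*attr_sets)
    return {
        "generic": name,
        "common_attributes": sorted(common),
        # Attributes that remain specific to each local relation.
        "subtypes": {
            rel: sorted(set(attrs) - common)
            for rel, attrs in local_schemas.items()
        },
    }


# Hypothetical example: two local relations generalized into "person".
person = generalize(
    "person",
    {
        "site1.employee": ["id", "name", "salary"],
        "site2.student": ["id", "name", "major"],
    },
)
```

Here `person["common_attributes"]` holds `id` and `name`, while `salary` and `major` stay with their respective local relations. Determining that two attributes really are equivalent is the step the cited methodologies address and this sketch takes for granted.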

5.3 Translation of DML Commands

This area has perhaps received the least effort of all. Even if an integrated schema complete with mapping to each local schema has been created, it will still be necessary for a global (local) DML command to be converted to its local (global) DML version in order to apply the desired operations on the indicated schema contents. Kuck and Sagiv [24] have provided a very general algorithm for relational to network conversion. Several questions remain unresolved. Given a set of preexisting query languages, particularly in the case of heterogeneous DDBMSs, should the distributed data base query language be an existing one, or a new language designed to have features of the existing ones? The choice of query language will undoubtedly have an effect on the query translation. This area represents a particularly worthwhile area for research.

In conclusion, we have attempted to clarify the various architectures for distributed data base systems

and reviewed several major DDBMSs by classifying them according to these architectures. The topic of DDBMS is yet to mature. Many of the technical issues in the design of DDBMSs (such as concurrency control and reliability) have been analyzed well, and techniques have been designed to solve them. However, several issues still remain to be researched, particularly in the area of performance.

REFERENCES

1. J. R. Abrial, Data Semantics, IFIP TC2 Working Group, Cargese, April (1974).
2. M. Adiba, J. M. Andrade, P. Decitre, F. Fernandez, and N. G. Toan, POLYPHEME: An experience in distributed data base system design and implementation, in Distributed Data Bases, (C. Delobel and W. Litwin, eds.), North-Holland Publishing Co., pp. 67-84, 1980.
3. M. Adiba and D. Portal, A Cooperative System for Heterogeneous Data Base Management Systems, Information Systems 3, 209-215 (1978).
4. V. D'Appollonio, A. Fuggetta, P. Lazzarini, M. Negri, and G. Pelagatti, The Integration of the Network and Relational Approaches in a DBMS, British National Conference on Data, pp. 176-199 (1984).
5. G. Belford, Distributed data base techniques: an assessment, in Current Directions in Data Base Development, Report No. REV-24-01-14, Auerbach Publishers Inc., New York, pp. 1-12, 1984.
6. O. H. Bray, Distributed Data Base Management Systems, Lexington Books, Lexington, Massachusetts, 1982.
7. S. Ceri and G. Pelagatti, Distributed Data Bases: Principles and Systems, McGraw Hill Inc., New York, 1984.
8. A. Cardenas and M. Pirahesh, Data Base Communication in a Heterogeneous Data Base Management System Network, Information Systems 5, 55-79 (1980).
9. Computer Corporation of America, Overview of the Chemical Substances Information Network, Technical Report No. CCA-81-03, April 1981.
10. P. Chen, The Entity Relationship Model: Towards a Unified View of Data, ACM Trans. Data Base Syst. 1(1), 9-36 (1976).
11. E. F. Codd, A Relational Model for Large Shared Data Banks, Commun. ACM 13(6), 377-387 (1970).
12. U. Dayal and H. Hwang, View Definition and Generalization for Data Base Integration in MULTIBASE: A System for Heterogeneous Distributed Databases, Proceedings of the 6th Berkeley Workshop on Distributed Data Management and Data Communications, pp. 203-238 (February 1982).
13. C. J. Date, An Introduction to Data Base Systems, Addison-Wesley, Reading, Massachusetts, 1986.
14. R. Davenport, Design of Distributed Data Base Systems, Computer J. 24(1), 31-41 (1981).
15. S. M. Deen, R. R. Amin, G. O. Ofori-Dwumfuo, and M. C. Taylor, The Architecture of a Generalized Distributed Data Base System-PRECI*, Computer J. 28(3), 209-215 (1985).
16. S. Dumpala and S. Arora, Schema translation using the entity-relationship approach, in Entity-Relationship Approach to Information Modeling and Analysis, (P. Chen, ed.), Entity Relationship Institute, pp. 339-360, 1981.
17. R. Elmasri, C. Devor, and S. Rahimi, Notes on DDTS: An Approach for Experimental Research in Distributed Data Base Systems, ACM SIGMOD Record, pp. 33-49 (July 1981).
18. R. Frank and R. Taylor, CODASYL Data Base Management Systems, ACM Comput. Surv. 8(1), 209-215 (1976).
19. V. Gligor and E. Fong, Distributed Data Base Management Systems: An Architectural Perspective, J. Telecommun. Networks 2(3), 249-270 (1983).
20. H. Johnson, J. Larson, and J. Lawrence, Transformations Between Network and Relational Data Models, Sperry Univac Research Report TMAO0729, Roseville, Minnesota, March 1978.
21. R. H. Katz, Software Architectures for Heterogeneous Data Base Management, Proceedings of the IEEE COMPSAC 81, pp. 3341 (1981).
22. S. R. Kimbleton, P. Wang, and E. Fong, XNDM: An Experimental Network Data Manager, Proceedings of the Fourth Berkeley Conference on Distributed Data Management and Computer Networks, Lawrence Berkeley Laboratory, University of California, California, 1979.
23. A. Klug, Multiple View, Multiple Data Model Support in System DM*, Department of Computer Science Technical Report, University of Wisconsin, Madison, Wisconsin, 1981.
24. S. Kuck and Y. Sagiv, A Universal Relational Data Base System Implemented Via the Network Model, ACM SIGMOD International Conference on Management of Data, pp. 147-157 (1982).
25. T. Landers and R. Rosenberg, An overview of MULTIBASE, in Distributed Data Bases, (H. J. Schneider, ed.), North-Holland Publishing Co., pp. 153-183, 1982.
26. W. Litwin, J. Boudenant, C. Esculier, A. Ferrier, A. M. Glorieux, J. L. Chima, K. Kabbaj, C. Moulinoux, P. Rolin, and C. Stangran, SIRIUS systems for distributed data management, in Distributed Data Bases, (H. J. Schneider, ed.), North-Holland Publishing Co., pp. 311-366, 1982.
27. M. Mannino, A Methodology for Global Schema Design, Unpublished Ph.D. Dissertation, Department of Management Information Systems, College of Business Administration, University of Arizona, Arizona, 1983.
28. A. Motro and P. Buneman, Constructing Superviews, ACM-SIGMOD International Conference on Management of Data, pp. 54-64, April (1981).
29. C. Moulinoux, J. C. Faure, and W. Litwin, MESSIDOR: A Distributed Information Retrieval System, ACM and BCS Research and Development Conference in Information Retrieval, Berlin, pp. 18-20, May (1982).
30. E. Neuhold and B. Walter, An overview of the architecture of the distributed data base system POREL, in Distributed Data Bases, (H. J. Schneider, ed.), North-Holland Publishing Co., pp. 247-290, 1982.
31. J. B. Rothnie, P. A. Bernstein, S. Fox, N. Goodman, M. Hammer, T. A. Landers, C. Reeve, D. W. Shipman, and E. Wong, Introduction to a System for Distributed Databases (SDD-1), ACM Trans. Data Base Syst. 5(1), 1-17 (1980).
32. M. Senko, E. Altman, M. Astrahan, and P. Fehder, Data Structures and Accessing in Data Base Systems, IBM Syst. J. 12(1), 11-25 (1973).
33. D. W. Shipman, The Functional Data Model and the Data Language DAPLEX, ACM Trans. Data Base Syst. 3(1), 140-173 (1981).
34. J. M. Smith, P. A. Bernstein, U. Dayal, N. Goodman, T. A. Landers, W. Lin, and E. Wong, MULTIBASE-Integrating Heterogeneous Distributed Data Base Systems, Proceedings AFIPS National Computer Conference, 50, pp. 487-499 (1981).
35. P. M. Stocker, M. P. Atkinson, P. M. D. Gray, W. A. Gray, E. A. Oxborrow, M. R. Shase, and R. G. Johnson, Proteus: A heterogeneous distributed data base project, in Data Bases-Role and Structure, (P. M. D. Gray and M. P. Atkinson, eds.), Cambridge University Press, pp. 125-149, 1984.
36. M. Stonebraker, The design and implementation of distributed INGRES, in The INGRES Papers: Anatomy of a Relational Data Base System, (M. Stonebraker, ed.), Addison-Wesley, pp. 187-196, 1986.
37. M. Tsubaki and R. Hotaka, Distributed multidatabase environment with a supervisory data dictionary data base, in Entity-Relationship Approach to System Analysis and Design, (P. Chen, ed.), North-Holland Publishing Co., pp. 625-646, 1980.
38. D. Tsichritzis and F. Lochovsky, Hierarchical Data Base Management, ACM Comput. Surv. 8(1), 105-123 (1976).
39. A. Weeldreyer, Structural Aspects of the Entity-Category Relationship Model of Data, Honeywell Report HR-80249, Honeywell Corporate Computer Science Center, Bloomington, Minnesota, 1980.
40. R. Williams, D. Daniels, L. Haas, G. Lapis, B. Lindsay, P. Ng, R. Obermarck, P. Selinger, A. Walker, P. Wilms, and P. Yost, R*: An overview of the architecture, in Improving Data Base Usability and Responsiveness, (P. Scheurmann, ed.), Academic Press, pp. 1-27, 1982.
41. E. Wong and R. Katz, Logical design and schema conversion for relational and DBTG data bases, in The INGRES Papers: Anatomy of a Relational Data Base System, (M. Stonebraker, ed.), Addison-Wesley, pp. 187-196, 1986.
42. C. Zaniolo, Design of relational views over network schemas, ACM-SIGMOD International Conference on Management of Data, pp. 179-190 (1979).
