
Scalable Simulation Campaign Management on the Grid

Iyad Alshabani, Luiz-Angelo Steffenel and Olivier Flauzac


Syscom - CReSTIC, Université de Reims Champagne-Ardenne, BP 1039, F-51687 Reims Cedex 2, France

Abstract: Simulation is a classical approach for scientifically evaluating the quality of competing designs and solutions. While Grid computing [8] has become a widespread way to aggregate resources, developing simulation experiments in a grid environment still relies on ad-hoc solutions, and no generic tool for preparing and deploying simulation experiments has been provided so far. Our aim is to provide such a generic environment by combining key OGSA concepts, such as Grid services and data services, with a distributed computing middleware. For this purpose, we implement the concept of a CONFIIT service, which allows the construction of service-oriented applications on top of the CONFIIT middleware. The CONFIIT service virtualizes the computational resources provided by the CONFIIT middleware and exposes them as a transparent Experiment Management Service for the simulation community.

I. INTRODUCTION

Simulation is a classical approach for scientifically evaluating the quality of competing designs and solutions. It offers complete control over experimental conditions, allowing reproducible experiments by which one can compare competing designs and solutions. The widespread use of simulation in several scientific communities has led to accepted tools and methodologies, for example in the areas of microprocessor design and network protocol design. The emergence of methodological tools for simulation is unfortunately still to be seen in the communities of high-performance computing (HPC), parallel computing (encompassing Grid computing), and distributed computing (encompassing peer-to-peer research). Since no standard tool exists, most people build their own ad-hoc solutions. This approach was certainly reasonable in the past, e.g., for simple and deterministic platforms (such as supercomputers) or when timing realism was not mandatory (as in most works in distributed computing). However, it is hard to justify for today's systems. Indeed, these systems exhibit very complex behaviors because of, for example, their heterogeneity (both quantitative and qualitative) and their scale. Likewise, as the amount of data exchanged in distributed algorithms increases, it becomes mandatory to obtain simulation timings that match real-world timings accurately.

While Grid computing [8] has become a widespread means of enabling application developers to aggregate resources scattered around the globe for solving large-scale problems, developing applications that can effectively use the Grid remains very difficult due to the lack of high-level tools to support developers. This is the case for simulation experiments, which still lack tools for preparing simulations, sweeping parameters and tuning performance that are generic enough to be used by multiple collaborations on computational Grids.

Performing simulations can have two main purposes: (a) simulation of a single specific execution, or (b) simulation of many executions. The goal of the first case is to study the behavior of a system under a set of specific configurations, such as a fixed initial state or a fixed topology. In the second case, the aim is to study the system as different parameters or models vary; simulations must then be repeated to allow for the analysis of the impact of several parameters and to draw relevant conclusions. The classical parameters include the size of the system and parameters that guide the structure of the platform's topology.

Executing many simulations requires that one be able to control a simulation software library, such as SimGrid [5] or DASOR [14]. To be efficient and to reduce the overall execution time, the best way is to execute several simulations at the same time. This can be achieved by performing simulations on a parallel platform, or on a grid platform using middleware infrastructures such as Globus [7], XtremWeb [3], DIET [4] or CONFIIT [6]. Simulation executions are then distributed across several processors. Issues of simulation result storage and of global simulation state must be handled appropriately and in a distributed way. Traditionally, the scale and scope of an experimental study were determined by the ability of the research team to generate data. With the increased ability to generate data, the scale and scope of an experimental study are now determined by the ability of the team to manage data rather than to generate it.
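Purpose (b) above, many executions with varying parameters, amounts to enumerating every combination of parameter values and running one independent simulation per combination. A minimal Python sketch of that enumeration follows; the parameter names (`nodes`, `topology`, `seed`) and their values are illustrative assumptions, not taken from an actual campaign.

```python
from itertools import product

def parameter_sweep(space):
    """Yield one dict per combination of parameter values.

    `space` maps a parameter name to the list of values to explore.
    """
    names = sorted(space)  # fixed order makes the enumeration reproducible
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))

# Hypothetical campaign: system size, topology family and random seed.
space = {
    "nodes": [10, 100, 1000],
    "topology": ["ring", "tree"],
    "seed": [1, 2],
}

runs = list(parameter_sweep(space))
print(len(runs))  # 3 * 2 * 2 = 12 independent simulations
print(runs[0])    # {'nodes': 10, 'seed': 1, 'topology': 'ring'}
```

Because each combination is independent of the others, the resulting list can be handed to any of the middleware systems cited above as a bag of tasks.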
In order to efficiently handle large-scale simulations over a grid or a cloud, we rely on an Experiment Management System (EMS). There have been a number of efforts to identify the requirements of an EMS, although none have been comprehensive, nor have they discussed the impact of the Web as a delivery technology. The ZOO group [10] lists three main requirements for an EMS:

- Uniform interface: scientists should be provided with a consistent user interface.
- Transparent data management: users should not have to deal with tedious data management issues.
- Hiding the details of the underlying software: the system should shield the user from the details and low-level tasks of the software.

Depending on the specificities of the domain or community, an EMS may be implemented in several ways. Although our work concentrates on simulation campaign management, we wish to make it as generic as possible in order to allow its maintenance and adaptation over the coming years. To this end, we rely on a Service Oriented Architecture (SOA) driven by a scientific workflow description as a way to improve on traditional Grid systems, which are rather monolithic and characterized by a rigid and simplistic structure. With SOA, the underlying software system is offered entirely as services. Additional features of SOA, such as interoperability, self-containment of services and loose coupling, bring more value to an integrated Grid-SOA solution. Initiatives towards the integration of SOA and Grid exist in Grid-related projects. The Open Grid Forum (OGF), together with OASIS and other organizations and standards bodies, is developing the Open Grid Services Architecture (OGSA, see [18], [16], [13]) and the Open Grid Services Infrastructure (OGSI, see [17]), where Grid service instances are expressed as WSDL interfaces.

On the other hand, Cloud computing is a new paradigm that is gaining more and more interest in the community thanks to virtualization, on-demand resource provisioning, usage optimization and the pay-per-use model. Based on the service oriented architecture, it is possible to combine multiple grid services provided by third parties and hence to build new and more complex services. These are then built around virtualization to form a cloud. Cloud computing services are nowadays categorized into the layers Infrastructure as a Service (IaaS), Platform as a Service (PaaS) and Software as a Service (SaaS). We are particularly interested in presenting grid middleware as a service for experiment management in a PaaS-type framework.
This framework intends to offer grid computing and storage resources as services for the management of simulation campaigns. We illustrate this framework through the implementation of services on top of the CONFIIT (Computation Over Network with Finite number of Independent and Irregular Tasks) computing middleware. CONFIIT [?] is a purely decentralized computing middleware, designed to cope with the deployment of parallel/distributed applications when faults and resource volatility are omnipresent. Our contribution extends this middleware with a service-oriented approach, providing CONFIIT as a fully adapted service for experiment deployment in a dynamic grid environment.

II. RELATED WORKS

In spite of their importance, most EMSs are specifically tailored for a given platform or experiment. This situation leads to a non-negligible lack of standards in the community, which translates into frequent rewriting of similar tools. Today, only a small number of generic platforms have been described in the literature, but the spread of SOA and scientific workflows tends to improve this offer.

Among the platforms we studied, ZOO [10] was designed to be a generic Desktop Experiment Management Environment (DEME). The ZOO philosophy is that a generic experiment management platform will handle most common experiment management tasks, but must be enhanced with some custom-made pieces, which can usually be generated with some effort (becoming a complete Customized Desktop Experiment Management System, CDEMS). Indeed, several projects rely on ZOO as their underlying management platform, customized to their own needs. Among these projects we can cite the National Magnetic Resonance Consortium¹ and the soil science experiments on Atmosphere-Land Exchange (Soil Science Department at the University of Wisconsin).

Another platform we studied is ZENTURIO [12], a fully integrated experiment management framework for scientists. The entry point for a user is a graphical User Portal, which normally resides on the local client machine (e.g., a laptop). Through the portal, the user creates or loads a ZEN application annotated with ZEN directives that specify value ranges for any problem, system, or machine parameter, including program variables, file names, compiler options, target machines, machine sizes, scheduling strategies, or data distributions. The user uses the ZEN performance directives to indicate the performance metrics to be measured and computed for each experiment. The functionality of the ZENTURIO experiment management tool is restricted to post-mortem multi-experiment performance analysis and parameter sweeps. The ZENTURIO framework appears heavy to use: the scientist must invest some effort in writing a ZEN application, which requires knowledge of the ZEN language.

Concerning grid computing, dozens of academic and industrial projects have been developed.
Today, several middleware systems offer complete services (grid construction and management, task sharing management and results gathering) to deploy an application over a grid. In this last group we can cite Globus [7], XtremWeb [3], DIET [4] and ProActive [1]. Many of these projects are now trying to propose PaaS solutions to allow the use and virtualization of computation resources. Indeed, scientific clouds are gaining more and more interest, as they combine well-known traditional Grid middleware with resource virtualization. Perhaps one of the most important works in this context is UNICORE [15]. OMII-Europe also provides an OGSA-BES interface for the Globus Toolkit that can be seen as a Web service front end to the Grid Resource Allocation Manager (GRAM) [2] of Globus. Finally, many commercial vendors (e.g., Microsoft, Altair Engineering, Platform Computing, HP) have adopted the OGSA-BES specification, while UNICORE is used by Fujitsu and is also provided as a ready-to-use open source implementation on SourceForge [19] under a BSD license.

Efforts have also been made in the field of experiment management in general. Some of them use the workflow paradigm to build an integrated environment for the experiment. In the same context of providing resources via cloud services, the Clouds@Home project [9] aims to provide guaranteed performance for computing and storage over unreliable, volunteered Internet resources. To that end, its approach integrates techniques for resource availability prediction, virtual machines and peer-to-peer data management.

¹ http://nmr.magnet.fsu.edu/resources/nmrc/appendix1.html

III. SIMULATION MANAGEMENT ARCHITECTURE

The framework we develop for simulation management is a collaborative computing environment which enables:

- the formation of the simulation campaign community of the Grid, resources and users as a virtual organization;
- the integration of simulation models and simulators provided by different development communities; and
- Web portal-based interaction between various experiments, allowing aggregation and advanced visualization.

A Grid-based architecture for the realization of this simulation campaign environment is shown in Figure 1.
[Figure 1 diagram labels: User Portal; Information Extraction and Analysis; XML parameter set; Simulators (SimGrid, DASOR, GridSim); Simulation Generation (compilation); Results; Execution Engine; executables; Grid Middleware (CONFIIT, DIET, PBS GRID, APST, CIGRI)]

Fig. 1: The Simulation Campaign Management Architecture

In this section we present our architecture for the management of simulation campaigns. The architecture of our tool is designed on top of Web services and SOA technology.

1) The User Portal: The user interacts with the tool to construct and conduct simulations and parameter studies through a single User Portal, consisting of panels that expose the full functionality through an intuitive graphical interface. The User Portal is designed as a small, lightweight program that is easy to install and manage on the local machine and that shields end users from the complexity of the underlying Grid environment. The User Portal may also be presented as a Web page that offers its functionality through a Web browser. Depending on their profile, users can express their needs through:

- a specific or generic language: the user writes a small amount of code to state which simulation and parameter sweeps are needed;
- scripts: scripts are lighter than language integration, since scripts are already used by operating systems and Grid middleware;
- composition of the simulation and its workflows in a semantic-based manner within a SOA-based environment.

The User Portal has the following functionalities:

- Grid User Authentication: through the portal, the user is authenticated by an authentication service connected to the Grid environment in order to prepare and submit simulation jobs. This service is responsible for several deployment tasks, such as proxy management, secure authentication, encrypted communication, and credential delegation to the Grid middleware services. It is connected to the job manager of the grid so that jobs can be submitted with sufficient authentication information.
- Campaign History: the environment provides access to the results of previous simulation campaigns as well as to post-mortem analysis of data.

One of the elements tied to the user interface is the Abstract Simulation Composition mechanism. The user describes the simulation campaign in the same manner as a scientific workflow orchestration [?]. A workflow description language has to be used to allow the composition of such a workflow. There are many composition languages in the literature; a variety of them are based on the Petri net formalism [?] or on business standards such as the BPEL orchestration language [?]. To obtain candidate services for the composition, the tool contacts an ontological registry in order to find suitable service operations for selecting the simulator and the parameter sweep. It then lets the simulation preparation service generate the executable binaries/orchestrations for the execution engine.
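The lookup of candidate services can be pictured as a mapping from abstract workflow steps to concrete services. The Python sketch below is a minimal illustration under stated assumptions: the registry is a plain dictionary rather than an ontological registry, and all step names, capabilities and service names are hypothetical.

```python
# Hypothetical registry: capability -> concrete services offering it.
REGISTRY = {
    "simulate": ["SimGridService", "DasorService"],
    "sweep": ["ParameterSweepService"],
    "analyze": ["ResultAnalysisService"],
}

def bind_workflow(abstract_steps, registry, preferred=None):
    """Map each abstract step (a capability name) to one concrete service.

    `preferred` optionally names services the user wants when available.
    Raises LookupError when no service offers a required capability.
    """
    preferred = set(preferred or [])
    bound = []
    for step in abstract_steps:
        candidates = registry.get(step, [])
        if not candidates:
            raise LookupError(f"no service registered for step '{step}'")
        # Prefer a user-chosen service, else take the first candidate.
        chosen = next((c for c in candidates if c in preferred), candidates[0])
        bound.append((step, chosen))
    return bound

campaign = ["sweep", "simulate", "analyze"]
bound = bind_workflow(campaign, REGISTRY, preferred=["DasorService"])
print(bound)
```

A real registry would match on semantic descriptions rather than exact capability names; the dictionary lookup only stands in for that matching step.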
2) Simulation Preparation: Preparing a simulation consists of obtaining the simulator required by the user and applying the parameter set that the user provided. This step is a service related to the User Portal, where the user indicates the parameters of the experiments. The simulator can be provided as a binary executable; in this case, preparing the simulation consists of associating the simulator with the parameter set as described by the simulator provider. When the simulator is provided as source code and the parameter set has to be injected into that source code, the simulation preparation injects the necessary source code and then compiles it to obtain an executable simulator with the desired parameter set.

Repository of Simulators: The repository of simulators contains a variety of simulators, in binary or source code form, each with a description of how to use it, its input parameters and its output format. Simulators are wrapped into tarball archive files together with the metadata describing them. XML-based documents are used to represent the simulator and its parameters.

Parameter Study: The parameter study is the set of parameters for each simulation, provided through XML documents.

Executable Simulations: The result of the simulation preparation is a set of ready-to-execute simulations. Once generated, either by compiling or by batching, they have the adequate format to be deployed/orchestrated in a Grid infrastructure. Deployment and execution management are done by the Execution Engine; it is therefore important that the executable simulations fit the format required by the execution engine.

3) Simulation (Experiment) Execution Engine: The execution engine translates and orchestrates the execution instructions of the simulation campaign generated by the simulation preparation. The User Portal interacts with the execution engine by abstractly composing the required simulation campaign. The execution engine can therefore check the output of the simulation preparation to refine the composition and execute it dynamically. After the user has composed the abstract simulation campaign through the User Portal, the next step is to map the abstract nodes onto matching services. This is done iteratively during the processing of the workflow. The simulations generated by the simulation preparation are instantiated by the execution engine, which deploys the instances on the Grid using the Grid resource services. As a consequence, the engine controls the configuration of the execution environment. It is therefore important to have an event-based control system that is able to capture system events and handle them automatically, without human interaction, while still satisfying the user's needs. Another important requirement is that the engine be able to handle exceptions and start/stop conditions in the simulation campaign.
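As an illustration of the XML-based parameter study described under Simulation Preparation, the sketch below writes one parameter set as an XML document and reads it back, using Python's standard `xml.etree.ElementTree` module. The element and attribute names (`simulation`, `simulator`, `parameter`, `name`) are assumptions made for the example; the actual schema used by the tool is not specified in this paper.

```python
import xml.etree.ElementTree as ET

def parameter_set_to_xml(simulator, params):
    """Serialize one simulation's parameters as an XML string."""
    root = ET.Element("simulation", {"simulator": simulator})
    for name, value in sorted(params.items()):
        node = ET.SubElement(root, "parameter", {"name": name})
        node.text = str(value)
    return ET.tostring(root, encoding="unicode")

def xml_to_parameter_set(document):
    """Parse the XML back into (simulator, {name: value-as-string})."""
    root = ET.fromstring(document)
    params = {p.get("name"): p.text for p in root.findall("parameter")}
    return root.get("simulator"), params

# Hypothetical parameter set for one run of a SimGrid-based simulator.
doc = parameter_set_to_xml("SimGrid", {"nodes": 100, "topology": "ring"})
print(doc)
print(xml_to_parameter_set(doc))
```

Values come back as strings after parsing, so a preparation service built this way would need the simulator's metadata to restore their types.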
The engine must also be able to check the correct behavior and performance of the simulation campaign. All this information about the environment is gathered through the underlying resource infrastructure.

IV. SERVICES IN CONFIIT

A. Overview of CONFIIT

The notion of FIIT applications was defined in [11]: they are composed of a Finite number of Independent and Irregular Tasks (FIIT). To develop in such a framework, the programmer only needs to decide how to divide the problem into a finite number of independent tasks and how to compute each individual task. Typical simulation parameter-sweeping experiments may be considered FIIT applications, as each task has a given parameter set and can be executed independently of the others. CONFIIT (Computation Over Network for FIIT) [6] is a middleware for distributed computing that

Fig. 2: Centralized Mode (diagram label: launcher)

provides a rich API to help developers create their FIIT applications, while providing a clear and reliable environment for users who wish to deploy their applications over a cluster or a grid². Each computer collaborating in a CONFIIT computation is called a node. A node is set up with three main thread components: a topology and communication manager, a task manager and one or several task solvers. The nodes are connected according to a logical oriented ring overlay network, set up and maintained by the topological layer of the system. Task statuses are exchanged to broadcast local knowledge to all nodes and thus to compute an accurate global view of the computation. At the end of the computation, the ring spreads the termination signal to the nodes. Due to the nature of FIIT applications, a node is able to decide locally which tasks still need to be computed, and can carry on the work autonomously if no other node can be contacted. If a node later reintegrates a community, it is able to share the results of the tasks it completed and to resynchronize its task list.

1) Programming models: Since the constraints of a given application can differ and are sometimes contradictory (fault tolerance, efficiency, etc.), CONFIIT offers two main programming models: distributed mode and centralized mode.

The distributed mode allows accurate fault tolerance in the computation, since task results are stored locally on each node in the community. Thus, a broken computation can be relaunched using already computed tasks. Fig. 2 shows the information exchanges in the community for a distributed application. First, the launcher sends the computing request to a node. The request is propagated along the community by the token. During computation, the results of individual tasks are propagated across the community so that each node can locally store all individual results (data blocks). Concurrently with the computations, information on the global computation is exchanged.
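The way a node keeps a local view of the computation, works autonomously and resynchronizes later can be sketched as follows. This is a simplified illustration in Python, not the CONFIIT API: the `Node` class, its method names and the direct `receive` call standing in for the ring token are all assumptions.

```python
class Node:
    """Simplified FIIT node: tracks which of a finite set of tasks are done."""

    def __init__(self, name, task_ids):
        self.name = name
        self.task_ids = list(task_ids)
        self.results = {}  # task id -> result, as known locally

    def pending(self):
        """Tasks this node still believes need computing."""
        return [t for t in self.task_ids if t not in self.results]

    def compute_next(self, solve):
        """Solve one pending task locally; return its id, or None if done."""
        todo = self.pending()
        if not todo:
            return None
        task = todo[0]
        self.results[task] = solve(task)
        return task

    def receive(self, other_results):
        """Merge results learned from another node (stand-in for the token)."""
        for task, result in other_results.items():
            self.results.setdefault(task, result)

def square(task):
    return task * task  # placeholder for an independent, irregular task

a, b = Node("a", range(4)), Node("b", range(4))
a.compute_next(square)            # a solves task 0 while disconnected
b.results[3] = square(3)          # b solved task 3 on its own
a.receive(b.results)              # a reintegrates and resynchronizes
b.receive(a.results)
print(a.pending(), b.pending())   # both now see tasks 1 and 2 as pending
```

Since tasks are independent, merging result sets in any order gives the same view, which is what lets a node decide locally what remains to be computed.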
Another advantage of the distributed mode is that the launcher only needs to be connected during the initiation phase. At the end of the computation, the global result can be retrieved from any node in the community.

The centralized mode reduces the global load of storage
² CONFIIT can be downloaded at http://cosy.univ-reims.fr/lsteffenel/CONFIIT

Fig. 3: Distributed Mode (diagram labels: launcher, receiver)

space and network communication, with the drawback of reducing fault tolerance. This mode is especially suited for the integration of external applications that manage data storage themselves. Fig. 3 shows the information exchanges in the community for a centralized application. First, the launcher sends the computing request to a node. The request is propagated along the community as in the distributed mode, but the launcher must remain connected. During computation, the results of individual tasks are sent to the initial launcher, which is in charge of storage (data blocks). As in the distributed mode, information on the progress of the global computation is updated through the token. The crash of a computing node only impacts its current tasks, as the remaining nodes recompose the ring topology to continue the computation. If the launcher crashes, partial results are blocked on the computing nodes until the launcher is restored. Unlike in the distributed mode, however, a computing node alone cannot restore the last known checkpoint, as it does not store the partial results of the other nodes.

B. Service Model for CONFIIT

The emergence of service oriented architecture has promoted the reuse of existing components and information resources and their assembly in a flexible manner. This has led to the adoption of the service-oriented model in the development of the Grid. The key idea is to allow the flexible assembly of grid resources by exposing their functionality through standard interfaces with an agreed interpretation. This facilitates the deployment of grid systems at all scales. OGSA has chosen the Web service infrastructure and framework, which means that OGSA systems and applications are structured according to service oriented architecture principles.
This choice is based on the effectiveness of the Web service architecture and on the fact that Web service technology is widely adopted and an industry standard. The main aim of OGSA is to determine where existing work does not meet the requirements of the Grid, rather than to introduce competing standards. One of the most significant extensions made by OGSA to Web services is the addition of statefulness and transience to the Web services standard, which results in the definition of the Grid service.

As shown in the previous section, CONFIIT can be used as a middleware for the Grid. On the other hand, it lacks a model for developing service-oriented grid applications. For this purpose, a new integrated service model needs to be defined in the CONFIIT architecture. As OGSA is becoming the standard service model for the Grid, we choose to make the CONFIIT middleware compliant with the OGSA specifications.

1) OGSA Based Model: Resources in OGSA are represented by services, called Grid services. A Grid service is an extension of a Web service and is described using the WSDL language, with extensions to comply with Grid requirements. This WSDL extension is called GSDL (Grid Service Description Language). An OGSA-compliant system is built by assembling Grid services. In order to be a Grid service, a component should extend the GridService interface (a portType in WSDL) [?]. Web services in their standard form are not defined to meet all the Grid requirements, hence the need to modify or extend the existing Web service standard. The OGSA architecture has since been involved in the definition of WSDL 2.0. Service interfaces are assumed to be defined using a WSDL-based language extension. XML is used as the lingua franca for description and specification, and SOAP as the primary protocol for OGSA services. OGSA-naming, the naming scheme for OGSA, uses a three-level naming convention: each named OGSA entity is associated with an optional human-oriented name, an abstract name, and an address. One of the key points of OGSA is that its capabilities do not all need to be present in a system; i.e., a system can be built using only a subset of the capabilities defined by OGSA. OGSA specifies the services, their interfaces and their semantics, but says nothing about their implementations and internals.
The Execution Management Service (EMS) in OGSA is concerned with instantiating, managing and completing units of work, which may be either OGSA applications or non-OGSA applications (such as a database server, a servlet running in a container, and other legacy applications).

2) CONFIIT Service: CONFIIT is considered to be a computational resource that speeds up the execution of FIIT problems over a network in a peer-to-peer model. Consequently, our middleware makes use of only a subset of the specifications defined by OGSA, namely the execution management services. The challenges to be addressed in the CONFIIT service-oriented model are:

- Service Semantics
- Service Description
- Naming and Service Discovery
- Service Container

First, we need to define some terms regarding the CONFIIT middleware. A CONFIIT cluster is the set of nodes on which the CONFIIT daemon is running and which work together to solve a FIIT problem using the CONFIIT middleware. The service model for CONFIIT takes into consideration that it works in one of two computation models, distributed and centralized. The existence of these two models led us to consider two options for the CONFIIT service model. The first is to represent every node in the CONFIIT cluster as a service (a CONFIIT service in our middleware), since each node is considered a computational resource. These services must communicate with each other and be orchestrated to do the whole job done by the CONFIIT cluster. Figure 4 illustrates this first case, where the CONFIIT computation model used is the distributed mode. The second is to represent the whole CONFIIT cluster as one service, which is responsible for managing the jobs executed over the nodes of the cluster. This second case, illustrated in Figure 5, presents CONFIIT capabilities as services, so that the whole cluster can be used to deploy applications. The client handles only the CONFIIT service, which in turn takes responsibility for the resources of the CONFIIT cluster, including the nodes on which the CONFIIT middleware runs and among which the application is divided.

Fig. 4: Services for CONFIIT in distributed mode

Fig. 5: Services for CONFIIT in centralized mode

3) Implementation of the CONFIIT Service: The implementation relies on the CONFIIT middleware as the underlying layer for job execution management, since the CONFIIT middleware handles network resource management, computation over the network, and fault tolerance. The CONFIIT service is a representation of the CONFIIT community described above, so the service describes the nodes on which the application can be run.
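The second option, in which the whole cluster is exposed as a single service, can be sketched as a facade that hides the nodes behind one entry point. The Python sketch below is hypothetical: the class `ClusterService`, its methods and the round-robin assignment are assumptions made for illustration. In the real system, task distribution and fault tolerance are handled by the CONFIIT middleware itself, and the actual service is a Java component.

```python
class ClusterService:
    """Facade exposing a set of nodes as one job-execution service."""

    def __init__(self, nodes):
        if not nodes:
            raise ValueError("a cluster needs at least one node")
        self.nodes = list(nodes)

    def describe(self):
        """Return the nodes on which an application could run."""
        return list(self.nodes)

    def submit(self, tasks):
        """Assign independent tasks to nodes round-robin.

        Returns {node: [tasks]}; a stand-in for the middleware's own
        distribution of a FIIT application over the community.
        """
        plan = {node: [] for node in self.nodes}
        for index, task in enumerate(tasks):
            plan[self.nodes[index % len(self.nodes)]].append(task)
        return plan

service = ClusterService(["node1", "node2", "node3"])
print(service.describe())
plan = service.submit(["t0", "t1", "t2", "t3", "t4"])
print(plan)
```

The client only ever talks to the `ClusterService` object, which mirrors the idea that handling the single service is enough to use the whole cluster.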
The main interfaces of the CONFIIT service are the ConfiitService interface, which defines methods for service manipulation (similar to those defined in the OGSA GridService interface), and methods for managing the CONFIIT community. The methods related to the CONFIIT community handle the CONFIIT middleware and the resources of the community.

Fig. 6: Use Case diagram for the CONFIIT services (diagram labels: Client App, CS App, Service Client, Confiit Middleware, Resource, CSExtension; Describe Service, Initialize ConfiitCommunity, Resource Description, Declare Service Data, Instantiate Service / Service Factory, Define Container, Check Availability, Resource Allocation, Actual Allocation, Deployment, Service Naming, Resolve Service Naming, Service Discovery)

The service data that accompanies the CONFIIT service describes the inner state of the service and is divided into two categories:

- Service information, which represents the state of the service and either the result produced by the application running on the CONFIIT community or the initial data provided to the application.
- Service meta-data, which holds information about the service itself; in the CONFIIT service it contains information about the CONFIIT community, such as the resources reserved for the service, the service timeout, and the cost of resource usage.

The service data is available through the ConfiitSD interface.

The use of the CONFIIT service model to design service-oriented applications is illustrated in Figure 7. First, any component that needs to be a CONFIIT service should extend the CONFIITService interface. As mentioned before, the service describes the CONFIIT community, which raises another point to take into account: the reservation of the nodes that will participate in the community. During the description phase, where we only describe the service, the component that acts as a CONFIIT service defines the default number of nodes involved in the community. The actual reservation of the nodes is done in the next phase, when the service is instantiated. For the time being, we assume that the reservation is done in the simplest way, where nodes are obtained with no consideration of administrative issues. We provide an interface for reservation, ReservationService, to be developed in the future as the interface for node reservation. The implementation of the ReservationService and the reservation and management of the nodes contributing to the community are out of the scope of this paper and will be discussed in future contributions. Once the component, which is a Java class here, extends the CONFIIT service interface, the service factory is in turn responsible for creating new service instances dedicated to the service description. In the service instantiation phase, the community is set up; i.e., the CONFIIT middleware runs on the nodes and is available to accept jobs in CONFIIT-like form.

Fig. 7: Services In CONFIIT (diagram labels: CONFIITService Interface, Service Interface, CONFIIT Interface, Service Data Interface, Service Factory, SampleService, Sample Service Factory, extends)

Fig. 8: Services In CONFIIT (diagram labels: Sample Service Factory, instantiate, Sample Service Instance, Client1, Client2, Client3)

4) CONFIIT as a Service: Providing the CONFIIT middleware as a service for users and developers of distributed computing applications is not a trivial task. Let us consider the scenario of a client that needs to submit a job to a CONFIIT service so that this service can allocate grid resources such as nodes and computation resources. The user has a Web portal through which to submit a Java jar file and to decide how many nodes should be used to execute the submitted job. The CONFIIT platform creates an instance from the CONFIIT service factory. The instance then creates the CONFIIT community, which is then managed by CONFIIT as presented in Section IV-A. This community is created and managed using the centralized mode of CONFIIT.

V. CONCLUDING REMARKS AND FUTURE WORK

CONFIIT proposes a framework to easily distribute the computations associated with a given application throughout wide-area networks. Its main characteristic, and benefit, is a fully asynchronous distributed environment. Through the extension of the FIIT interface, applications can easily benefit from CONFIIT properties such as communication efficiency and fault tolerance. The use of the CONFIIT service promotes reusability and the virtualization of the resources of the Grid environment. Users do not have to worry about underlying management such as load balancing or logical structure: the system manages it on its own. The model is based on the OGSA specification and takes grid environments and applications into consideration. It takes into account the computing and storage resources of the grid.

The CONFIIT service makes a good PaaS, offering its underlying middleware as a service to create, deploy and run a community of computation nodes. In future work, we plan to investigate a PaaS-level service to manage clusters and orchestrate the different CONFIIT services, which form a set of communities, each service running a community of CONFIIT middleware. This can also be envisaged by creating an on-demand cloud at the IaaS level, where we decide the number of nodes, the network topology and the logical communities that run jobs.

ACKNOWLEDGEMENT

This work was supported by ANR (Agence Nationale de la Recherche, France), project reference ANR USS-SimGrid (ANR 08 SEGI 022)³.

REFERENCES
[1] Laurent Baduel, Francoise Baude, Denis Caromel, Arnaud Contes, Fabrice Huet, Matthieu Morel, and Romain Quilici. Grid Computing: Software Environments and Tools, chapter Programming, Deploying, Composing, for the Grid. Springer-Verlag, January 2006.
[2] Athman Bouguettaya, Ingolf Krüger, and Tiziana Margaria, editors. Service-Oriented Computing - ICSOC 2008, 6th International Conference, Sydney, Australia, December 1-5, 2008. Proceedings, volume 5364 of Lecture Notes in Computer Science, 2008.
[3] Franck Cappello and Samir Djilali. Computing on large-scale distributed systems: XtremWeb architecture, programming models, security, tests and convergence with grid. Future Gener. Comput. Syst., 21(3):417-437, March 2005.
[4] Eddy Caron and Frédéric Desprez. DIET: A scalable toolbox to build network enabled servers on the grid. International Journal of High Performance Computing Applications, 20(3):335-352, 2006.
[5] Henri Casanova, Arnaud Legrand, and Martin Quinson. Simgrid: A generic framework for large-scale distributed experiments. In Computer Modeling and Simulation, 2008. UKSIM 2008. Tenth International Conference on, pages 126-131, April 2008.
[6] O. Flauzac, M. Krajecki, and L.A. Steffenel. Confiit: a middleware for peer-to-peer computing. Journal of Supercomputing.
[7] Ian Foster and Carl Kesselman. Globus: A metacomputing infrastructure toolkit. International Journal of Supercomputer Applications, 11:115-128, 1996.
[8] Ian Foster and Carl Kesselman, editors. The grid: blueprint for a new computing infrastructure. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1999.
[9] Clouds @ Home. Clouds @ home. http://clouds.gforge.inria.fr.
[10] Yannis E. Ioannidis, Miron Livny, Anastassia Ailamaki, Anand Narayanan, and Andrew Therber. Zoo: a desktop experiment management environment. SIGMOD Rec., 26:580-583, June 1997.
[11] Michaël Krajecki. An object oriented environment to manage the parallelism of the FIIT applications. In Proceedings of the 5th International Conference on Parallel Computing Technologies, PaCT '99, pages 229-234, London, UK, 1999. Springer-Verlag.
[12] Radu Prodan and Thomas Fahringer. Zenturio: An experiment management system for cluster and grid computing. In Proceedings of the 4th International Conference on Cluster Computing (CLUSTER 2002), pages 9-18. IEEE Computer Society Press, 2002. http://www.par.univie.ac.at/project/zenturio.
[13] Radu Prodan and Thomas Fahringer. From web services to OGSA: Experiences in implementing an OGSA-based grid application. grid, 00:2, 2003.
[14] C. Rabat. Dasor, a Discret Events Simulation Library for Grid and Peer-to-peer Simulators. Studia Informatica Universalis, 7(1), 2009.
[15] Bernd Schuller, Bastian Demuth, Hartmut Mix, Katharina Rasch, Mathilde Romberg, Sulev Sild, Uko Maran, Piotr Bala, Enrico del Grosso, Mosè Casalegno, Nadège Piclin, Marco Pintore, Wibke Sudholt, and Kim Baldridge. Chemomentum - UNICORE 6 based infrastructure for complex applications in science and technology. In Luc Bougé, Martti Forsell, Jesper Larsson Träff, Achim Streit, Wolfgang Ziegler, Michael Alexander, and Stephen Childs, editors, Euro-Par Workshops, volume 4854 of Lecture Notes in Computer Science, pages 82-93. Springer, 2007.
[16] Domenico Talia. The open grid services architecture: Where the grid meets the web. IEEE Internet Computing, 6(6):67-71, 2002.
[17] S. Tuecke, K. Czajkowski, I. Foster, J. Frey, S. Graham, C. Kesselman, D. Snelling, and P. Vanderbilt (eds.). Open grid services infrastructure (OGSI).
[18] S. Tuecke, K. Czajkowski, I. Foster, J. rey, F. Steve, and G. Carl. Grid service specification, 2002.
[19] unicore. Unicore. http://www.unicore.eu.

³ http://uss-simgrid.gforge.inria.fr/
