You are on page 1of 29

J Supercomput (2015) 71:15051533

DOI 10.1007/s11227-014-1376-6

Efficient task scheduling algorithms for heterogeneous


multi-cloud environment
Sanjaya K. Panda Prasanta K. Jana

Published online: 22 January 2015


Springer Science+Business Media New York 2015

Abstract Cloud Computing has grown exponentially in the business and research
community over the last few years. It is now an emerging field and becomes more
popular due to recent advances in virtualization technology. In Cloud Computing,
various applications are submitted to the datacenters to obtain some services on payper-use basis. However, due to limited resources, some workloads are transferred to
other data centers to handle peak client demands. Therefore, scheduling workloads
in heterogeneous multi-cloud environment is a hot topic and very challenging due
to heterogeneity of the cloud resources with varying capacities and functionalities.
In this paper, we present three task scheduling algorithms, called MCC, MEMAX
and CMMN for heterogeneous multi-cloud environment, which aim to minimize the
makespan and maximize the average cloud utilization. The proposed MCC algorithm
is a single-phase scheduling whereas rests are two-phase scheduling. We perform
rigorous experiments on the proposed algorithms using various benchmark as well
as synthetic datasets. Their performances are evaluated in terms of makespan and
average cloud utilization and experimental results are compared with that of existing
single-phase and two-phase scheduling algorithms to demonstrate the efficacy of the
proposed algorithms.
Keywords Cloud computing Multi-cloud environment Task scheduling
Makespan Cloud utilization

S. K. Panda
Department of Computer Science and Engineering, Veer Surendra Sai
University of Technology, Burla 768018, India
e-mail: sanjayauce@gmail.com
P. K. Jana (B)
Indian School of Mines, Dhanbad 826004, India
e-mail: prasantajana@yahoo.com

123

1506

S. K. Panda, P. K. Jana

1 Introduction
Recent advances of virtualization technology have made Cloud Computing [13] as
an emerging computing paradigm that offers dynamic provisioning of computing services on pay-per-use basis [4]. The services are offered in the form of infrastructure, platform and software (applications) which are referred as IaaS (Infrastructure
as a Service), Paas (Platform as a Service) and SaaS (Software as a Service) in IT
industry [5,6]. Cloud Computing has gained enormous popularity in both business
and scientific communities because of its cost effectiveness, reliability and scalability [1,5]. In addition to these, the users do not require any investment for procuring any new infrastructure. They can obtain their demanded services from anywhere
in the world just by payment and without worrying about the complexity of the IT
infrastructure.
An application is a collection of multiple tasks which is represented by a Directed
Acyclic Graph (DAG). Independent tasks in a DAG can be executed concurrently
by multiple virtual machines (VMs) while related tasks need to be executed in the
correct sequential order predefined by precedence among the tasks. Scheduling the
tasks for their execution with minimum makespan (i.e., overall execution time of
all the tasks) is a NP-complete problem [715]. Therefore, various studies [8,10
30] have been made to obtain near optimal solution for Cloud Computing. Most of
these works have been carried out in a unified environment, i.e., on a single cloud.
However, there is no data center that has unlimited resources. Therefore, there is a
need to distribute workloads among multiple clouds. Consequently, the main focus of
research in Cloud Computing in the recent years, has been how to schedule all the tasks
of the applications across multiple clouds. Particularly, this is a challenging issue in
the federated heterogeneous multi-cloud environment. To the best of our knowledge,
cloud list scheduling (CLS) and cloud MinMin scheduling (CMMS) [16] are the first
two works which are applicable to federated heterogeneous multi-cloud systems.
In this paper, we address the following task scheduling problem in federated heterogeneous multi-cloud environment. Given a set of n applications along with their
arrival time and mode of execution and given a set of m clouds along with execution
time of the tasks on different clouds, the goal is to schedule all the tasks to the clouds
so that they can be executed with minimum makespan. We present here three algorithms called Minimum Completion Cloud (MCC), MEdian MAX (MEMAX) and
Cloud MinMax Normalization (CMMN) toward the solution of this problem. MCC
is a single-phase scheduling whereas MEMAX and CMMN are two-phase scheduling algorithms. Single-phase scheduling is applicable for online scheduling. We show
that while CMMN performs better in makespan, the MEMAX achieves better average
cloud utilization. We perform extensive experiments on the proposed algorithms using
various synthetic as well as benchmark datasets [31] and measure the performance
using various metrics such as makespan and average cloud utilization. The experimental results show that the proposed algorithms outperform two standard algorithms, i.e.,
MinMin and MaxMin algorithm [10] in terms of these performance metrics. The
algorithms also perform better than the algorithm presented in [16]. Our major contributions can be summarized as follows.

123

Efficient task scheduling algorithms

1507

Development of three task scheduling algorithms for heterogeneous multi-cloud


environment.
Simulation of the proposed algorithms with synthetic as well as benchmarks
datasets.
Performance evaluations of the proposed algorithms through performance metrics,
namely makespan and cloud utilization.
Comparison of experimental results with related existing algorithms to show effectiveness of the algorithms.
The remainder of this paper is organized as follows. Section 2 presents related work
with their pros and cons. Section 3 describes the model and formulates the scheduling
problem. Section 4 presents the proposed algorithms. Various performance metrics
are presented in Sect. 5 for evaluation of the proposed algorithms. Section 6 presents
the experimental results and their comparisons with the existing algorithms followed
by the conclusion in Sect. 7.
2 Related work
Many task scheduling algorithms have been developed for various computing platforms such as cluster, grid, parallel and distributed processing [3234]. Some of these
algorithms are extended to Cloud Computing [10,16,17]. However, they fail to fulfill
the requirements of Cloud Computing with respect to their cost effectiveness, reliability, scalability etc. Lawler et al. [35] have solved the preemptive scheduling on
unrelated parallel processors using linear programming. In their work, it is explicitly
shown that optimal scheduling of n jobs on m processors require no more than O(m 2 )
preemptions. Liu et al. [36] have presented a priority rule-based heuristic serial schedule algorithm for unrelated parallel machine scheduling with arbitrary precedence
constraints. The priority rule can be selected from the arithmetic mean and deviation
of the processing times to schedule the prior job-machine pair. Kumar et al. [37] have
presented polylogarithmic approximations for unrelated parallel machines under treelike precedence constraints. The first step in their algorithm includes the method of
Lenstra et al. [38] to get the processor assignment followed by the generalized randomdelay technique of Leighton et al. [39] to the trees. Smith et al. [40] have proposed
various scheduling algorithms for advance reservation of resources based on first come
first serve (FCFS) and least work first (LWF) queues using conservative backfilling.
But their algorithms result in long waiting time for execution of the applications. The
algorithm proposed by Sotomayor et al. [22] requires the user to request resources in
the form of leases. A lease is submitted either in advance reservation (AR) or best effort
(BE) mode. These leases are incorporated in Haizea VM-based lease manager [41].
However, a lease can be accepted or rejected by cloud service provider based on the
availability of the resources but this results in poor lease acceptance ratio. Therefore,
Akhani et al. [24] have introduced a negotiation policy between the cloud provider
and customer. It improves the lease acceptance ratio.
The expected time to compute (ETC) matrix estimates the execution time of the
tasks on each cloud. The expected value is based on the task characteristics such
as set of instructions, data size and cloud characteristics like processing speed and

123

1508

S. K. Panda, P. K. Jana

bandwidth. It may happen that some expected values are much farther than the others.
These values are called inconsistent elements which increases the scheduling length.
To identify the inconsistent elements, Ergu et al. [20] have used an induced bias matrix.
It improves consistency ratio which is calculated using eigen vector and eigen value.
However, it is not a feasible approach if the matrix size is very big. Rai et al. [21]
have presented the resource allocation problem in the form of bin-packing problem.
Wang et al. [10] have introduced a two-phase scheduling algorithm for load balancing
in a three-level Cloud Computing network. The first phase uses opportunistic load
balancing (OLB) and the second phase uses load balance MinMin (LBMM) to assign
a task to a service node. However, this algorithm uses a threshold to estimate the load
before task scheduling takes place. The round-robin algorithm is used in the GoGrid
[42] cloud system. The algorithm schedules the tasks to the clouds evenly i.e., one
after another despite of the heterogeneous loads across clouds. Wen et al. [19] have
proposed a load balancing approach which is applicable for cloud-based multimedia
system. However, the utilization of resource is still ineffective. Sotomayor et al. [23]
presents a model to decrease overhead. Several meta-heuristic approaches [4348]
are applied to task scheduling and resource allocation which provide near optimal
solutions. Hou et al. [43] have presented a method based on genetic algorithm (GA)
for multiprocessor scheduling problem. The method is applicable to any random task
graph but considered only one task graph at a time for task scheduling. Yang et al.
[44] have proposed a prediction method based on evolutionary algorithm to handle
the fluctuation of the load. Nathani et al. [28] have presented the resource allocation
policy using Haizea VM-based lease manager [41]. The policy deals with a mode called
deadline sensitive (DS) which is supported by Haizea. It assumes that the deadline
sensitive tasks are preemptable. Also, the BE mode assigns the demanded resources as
per available basis, otherwise it is placed in a queue. However, most of the literatures
presented above are not suitable for heterogeneous multi-cloud environment.
Recently, Li et al. [16] have presented two online scheduling algorithms called cloud
list scheduling (CLS) and cloud MinMin scheduling (CMMS) for task scheduling in
IaaS cloud. These scheduling incorporates the AR and BE modes which are presented
in terms of applications consisting of various tasks. The CLS chooses the minimum
execution time cloud without considering the cloud ready time (CRT). Therefore, it
leads to load imbalance problem. On the other hand, CMMS is quite similar to Min
Min algorithm of grid computing. Moreover, their algorithms are not experimented
using benchmark dataset. In this paper, we deal with the same task scheduling problem
as described in [16]. Although our proposed algorithms have same time complexity
as that of [16], they produce better makespan and also better average cloud utilization
on benchmark as well as synthetic datasets due to load balance scheduling. They also
outperform MinMin and MaxMin algorithms [10].
3 Models and problem statement
3.1 Cloud model
We assume that a number of data centers provide on demand resources and computing
facilities over internet in collaborative fashion. All computational resources are in the

123

Efficient task scheduling algorithms

1509

form of VMs deployed in the data centers and the VMs from different clouds are
of different types and characteristics. For example, they may have different numbers
of CPUs of different cycle times, memory capacity and network bandwidths. For
dispatching the services in a federated manner, we assume that every data center has a
manager server which knows the current status of VMs in its own cloud. When a cloud
receives an application from a client, it executes it in cooperation with other clouds
as follows. It communicates the manager server of other clouds to distribute the tasks
among them including it. The manager servers keep track of the resource availabilities
in other clouds as there is no centralized node in the model. On receiving an application,
the manager server of the cloud first partitions the application into several tasks. Then,
for each task it takes the decision which cloud will execute it based on the information
from the manager servers of all other clouds. When a manager server assigns the task
to its own cloud, it stores in a queue and executes the task as and when the data and
resources are ready. However, prior to executing the task the manager server transfers
the disk image to all computing nodes for execution of the tasks. We assume that
all required disk images are stored in the data center and can be transferred to any
clouds as needed. Note that each cloud can execute only one task assigned to it at
a time. However, several tasks can be executed on different clouds simultaneously
and the data transfer time for the results of predecessor tasks to another cloud is
ignored.
For allocation of resources, we assume two modes, AR and BE. The popular
resource lease manager Haizea [22,41] uses the AR mode and most of the current
cloud providers support BE mode. In AR mode, resources are reserved in advance
and they should be available at a specific time. In BE mode, resources are provided as
soon as possible and requests are stored in a queue. Execution of a task in AR mode
may preempt a BE task. If a BE task is preempted from a cloud, say Ci intermittently
due to arrival of an AR task, it must further be executed only on the cloud Ci immediately after the AR task is over. Note that the above model is similar to IaaS cloud
system as used in [16]. However, we describe here for the sake of completeness of our
paper.
3.2 Application model and problem statement
Consider a set of m clouds C = {C1 , C2 , C3 ,, Cm } and a set of n applications A =
{A1 , A2 , A3 ,, An } in which each application Ai is represented in the form of a DAG
D = (Ti , E i ) where Ti = {Ti1 , Ti2 , Ti3 ,, Ti pi } denotes a set of pi tasks (called nodes)
and E i denotes a set of links (called edges). An edge E i jk = (Ti j Tik ) E i
represents the dependency between the task Ti j and task Tik . Note that the total number
of tasks in an application Ai (i.e., |Ai |) need not be equal to the total number of tasks
in another application A j (i.e., |A j |). But all the applications are disjoint from each
other, i.e., {Ai } {A j } = , i, j, 1 i, j n. Each application Ai has different
arrival time and can be submitted to the cloud provider with two modes (AR or BE). A
task should not start unless all predecessor tasks are completed. Each task has different
execution time on different clouds. Let ETCi j ,k denotes the execution time of task j
of application i on cloud k. The ETC matrix is presented in Eq. 1. Given an ETC
matrix, the problem of task scheduling is to allocate all the tasks to the clouds such

123

1510

S. K. Panda, P. K. Jana

Customer 3

Customer 2

Customer 1

Application 1
Arrival Time: 0

Application 2
Arrival Time: 10

(a)

(b)

Application 3
Arrival Time: 15

(c)

Fig. 1 The DAG of three applications

that the tasks are executed fulfilling their precedence with the objective to minimize
the makespan of the entire set of applications.

T11

T12
A1
..

T1 p1
..
ET C =
.
Tn1

Tn2
An
..

Tnpn

C1

E
T
C11,1

E
T
C

12,1

..

E
T
C

1 p1 ,1

..
.

E
T
C

n1,1

E T Cn2,1

..

E T Cnpn ,1

C2
E T C11,2
E T C12,2
..
.

..
.

Cm
E T C11,m
E T C12,m
..
.

E T C1 p1 ,2 E T C1 p1 ,m
..
..
..
.
.
.
E T Cn1,2 E T Cn1,m
E T Cn2,2 E T Cn2,m
..
..
..
.
.
.
E T Cnpn ,2 E T Cnpn ,m

(1)

3.3 An illustration
Let us consider an example as shown in Fig. 1. There are three applications A1 = (T1 ,
E 1 ), A2 = (T2 , E 2 ) and A3 = (T3 , E 3 ) where A1 = (T1 = {A, B, C, D, E} and
E 1 = {AB, AC, AD, BE, CE, DE}), A2 = (T2 = {F, G, H , I , J } and E 2 = {FH, FI,
GH, GJ, HI, HJ}) and A3 = (T3 = {K , L, M, N , O, P} and E 3 = {KL, KM, LN, MP,
NO}) in which A2 has AR mode and A1 and A3 has BE. Assume that the arrival time
of the A1 , A2 and A3 is 0, 10 and 15 respectively. These applications are scheduled
to four different clouds. The ETC matrix representing execution time of each task on
each cloud is shown in Table 1. Note that we present the ETC matrix by interchanging
the rows (clouds) and columns (applications) in contrast to Eq. 1. Note that in each
cloud, the execution time of the task is different because we present this problem in
heterogeneous multi-cloud environment.

123

Efficient task scheduling algorithms

1511

4 Proposed algorithms
4.1 Minimum completion cloud scheduling
The Minimum Completion Cloud (MCC) scheduling finds the completion time of the
ready tasks on each cloud. The completion time of the task is the sum of the expected
computation time of the tasks and the cloud ready time which is the earliest time that
the cloud is going to be ready after completing the execution of all the assigned tasks.
At last, the tasks are assigned to a cloud that results minimum completion time. If there
is more than one ready tasks then we follow alphabetic order to break the ties. Given
the DAGs of Fig. 1 and the ETC matrix in Table 1 (refer Sect. 3), the corresponding
Gantt chart for four clouds is shown in Table 2 in which an * denotes idle time and
denotes the time span. The method is illustrated as follows. The entry task A takes
minimum completion time in cloud 3 which is 4 units of time. Therefore, task A is
assigned to cloud 3 first. The ready time for successor tasks B, C and D is 4 units of
time as task A requires 4 units of time to complete. Tasks B, C and D are now ready
for scheduling. The minimum completion time of these tasks are 2 + 4, 7 + 4 and 2
+ 4 respectively. Therefore, B or D having the same minimum completion time can
be assigned next. However, we assign task B to cloud 2 to follow alphabetic order.
Similarly, we assign other tasks to their minimum completion cloud respectively. Note
that task F preempts the task C from cloud 4 at 10 time unit as task F is an AR task. As
per our cloud model, task C must further be executed only on the cloud 4 immediately
after AR task F is completed. Similarly, task I preempts the task E from cloud 2 at
16 units of time as task I is an AR task. The cloud makespan is 36 units of time. The
average cloud execution time is 31.5 units of time which is calculated by averaging

Table 1 An ETC matrix with


16 tasks and 4 clouds

Tasks

A B C

E F G H I J

K L

Cloud 1 9 3

Cloud 2 5 2

9 10 3 8 6 2 2 18 3

M N

6 6 9 3 5 4 19 6 10 3

O P

4 3 9

8 2

3 3 8

Cloud 3 4 9 10

2 7 5 8 7 6 13 7

9 4

9 5 7

Cloud 4 8 4

6 8 4 9 8 5 16 8

8 2

10 4 6

Table 2 The scheduling sequence in MCC


Time

010

1013

1315

1521

2129

2933

Cloud 1

3336
O

Time

04

46

614

1416

1618

1821

2123

2331

3136

Cloud 2

Time

04

46

616

1629

2936

Cloud 3

Time

04

410

1014

1415

1521

2129

2936

Cloud 4

123

1512

S. K. Panda, P. K. Jana

Table 3 Task mapping in MCC


Tasks

Clouds

makespan of the individual cloud [i.e., (36 + 31 + 30 + 29)/4]. The average cloud
utilization is 0.875 which is calculated by dividing average cloud execution time by
makespan [i.e., (31.5 / 36)]. The corresponding task-cloud mapping is shown in Table
3. The main merit of MCC is that it overcomes the load imbalance problem as it assigns
the tasks depending on the completion time of the tasks.
4.1.1 Pseudo code for MCC
For the pseudo code of MCC, we use the following terminologies. The pseudo code
for MCC is shown in Fig. 2.
Notation

Definition

Queue of tasks

Q AR

Advance reservation queue

QBE

Best effort queue

QXX

AR or BE queue

QT

Temporary queue

|Q T |

Total number of tasks in the temporary queue

Delete (X )

Delete the first element of queue X

MC (ij, k)

Minimum completion of task j of application i on cloud k

ETC (ij, k)

Expected time to compute task j of application i on cloud k

CRT (k)

Ready time on cloud k

EST (ij)

Earliest start time of task j of application i

MCC algorithm uses four queues: Q, Q A R , Q B E and Q T . When a users request


(or task) arrives, the service provider places the request in Q in First Come First Serve
(FCFS) order. Then, the replica of the request is placed in either Q A R orQ B E . Note that
the queues contain the ready tasks of all submitted applications, i.e., a task is inserted
by the manager as soon as all predecessors have been scheduled. If Q A R is not empty, it
removes the first element of Q A R (say, temp) (Line 23). Then, it calculates the earliest
start time (EST) of temp (Line 4). EST is the time at which the task can start provided
all its predecessor tasks are completely executed. Then, it finds the completion time
in each cloud using the Equation shown in line 6 followed by selecting the task-cloud
pair which takes minimum completion time (Line 8). Note that the start time of temp
is the EST and the finish time of temp is minimum completion (MC) time. However,
the slot between the start and finish time may be occupied by BE tasks. Therefore, it
calls the Procedure PREEMPT-BE-TASKS (Line 9). If BE tasks are already scheduled,

123

Efficient task scheduling algorithms

1513

Algorithm: MCC

1. while Q NULL do
2.
if QAR NULL
3.
temp Delete(QAR)
4.
Find EST(temp)
5.
for k = 1, 2,, m
6.
MC(temp, k) = ETC(temp, k) + EST(temp)
7.
endfor
8.
Find the task-cloud pair (temp, l) that gives minimum MC(temp, l)
9.
Call PREEMPT-BE-TASKS(EST(temp), MC(temp, l), l)
10. else
11.
temp Delete(QBE)
12.
for k = 1, 2,, m
13.
MC(temp, k) = ETC(temp, k) + CRT(k)
14.
endfor
15.
Find the task-cloud pair (temp, l) that gives minimum MC(temp, l)
16. endif
17. Assign temp to cloud l
18. CRT(l) = CRT(l) + ETC(temp, l)
19. Call ADD-SUCCESSORS(temp)
20. Remove temp from Q
21. endwhile
Procedure 1: PREEMPT-BE-TASKS(x, y, z)

1. if BE tasks are scheduled on cloud z from x to y units of time


2. Preempt BE tasks and place it in QBE
3. Update CRT(z)
4. endif
5. Return
Procedure 2: ADD-SUCCESSORS(temp)
1. Add the successor(s) of temp in QT and Find |QT|
2. if QT NULL
3. for q = 1, 2, , |QT|
4.
temp Delete(QT)
5.
if all the tasks predecessor to temp is executed
6.
Add temp to Q and QXX
7.
endif
8. endfor
9. endif
10. Return
Fig. 2 Pseudo code for MCC algorithm

then the tasks are preempted and placed at the front of Q B E (Line 12 of Procedure
1). At last, the ready time is updated. If Q A R is empty then it selects the BE task
(Line 1011). Like AR task, it finds the minimum completion in each cloud using the

123

1514

S. K. Panda, P. K. Jana

Equation in line 13. The only difference is that it calculates the CRT instead of EST
as the BE task is executed when there is no AR task. The task is scheduled to the
minimum completion cloud (Line 17). The cloud ready time is updated (Line 18). The
successors of the task are added into Q and Q A R or Q B E (represented as Q X X ) by
calling Procedure ADD-SUCCESSORS (Line 110 of Procedure 2). The completed
task is removed from the queue Q (Line 20). The while loop iterates until Q is empty
(Line 121).
Note that MCC follows similar principle of task allocation as the minimum completion time (MCT) [1012] algorithm in grid computing which is reflected by Line
1214 in the pseudo code of MCC (Fig. 2). However, there are many differences: (1)
the MCT does not consider the precedence among tasks, (2) MCT follows immediate
mode in contrast to AR or BE as considered by MCC and thus MCT schedules only
one task at a time where as MCC supports more than one tasks at a time and (3)
MCT is a non-preemptive scheduling but MCC is a preemptive scheduling. The only
difference between MCC and Li et al. [16] (CLS) is that CLS does not consider cloud
ready time whereas MCC does so. Therefore, MCC results in better makespan than
the CLS algorithm.
Theorem 4.1 The overall time complexity of the algorithm MCC is O(lm)
Proof Let l be the number of tasks and m is the number of clouds. Then, each of the
Steps 2 to 4 of the algorithm MCC requires O(1) time. Steps 5 to 8 require O(m)
time. Step 9 (i.e., Procedure 1) requires O(1) time. Again Steps 10 to 11 require O(1)
time. Steps 12 to 15 also require O(m) time. Each of the Steps 16 to 18 requires O(1)
time and Step 19 requires O(|Q T |) O(l) time. Step 20 requires O(1) time and
Step 1 is iterated l times. Therefore MCC requires O(lm) time.
Remark 4.1 Note that the overall time complexity of the algorithm MCC is same as
cloud list scheduling (CLS) [16]. However, as discussed, it has better makespan than
the CLS algorithm and we establish this fact through simulation run (refer Sect. 6).
4.2 MEdian MAX scheduling
MEdian MAX(MEMAX) is a two-phase scheduling algorithm. In the first phase, it
calculates the median of execution time of all ready tasks over all the clouds and
then in the second phase, it selects that task which has maximum median value and
assigns it to that cloud which takes minimum execution time. Consider the DAGs in
Fig. 1 and the ETC matrix in Table 1 (refer Sect. 3), the corresponding Gantt chart for
four clouds is shown in Table 4. The entry task A has median 5 followed by maximum
execution time of 5 units of time as task A is the only task which is ready for execution.
However, task A is assigned to cloud 3 as the minimum completion time is achieved
in that cloud. Next, its successor tasks B, C and D are ready for execution. Medians
of their execution times over the clouds are 3.5, 8.5 and 6 respectively in which 8.5
is the maximum one. Therefore, task C is assigned to minimum completion cloud 4.
Similarly, we assign task D and B to cloud 3 and 2 respectively. In MEMAX, task F
preempts the task C from cloud 4 at 10 units of time as task F is an AR task. Similarly,

123

Efficient task scheduling algorithms

1515

Table 4 The scheduling sequence in MEMAX


Time
Cloud 1

010
*

1013
G

1315
*

1521
K

2132
*

3235
O

Time

04

46

614

1416

1618

1821

2129

2932

3235

Cloud 2

2935

Time

04

46

616

1629

Cloud 3

Time

04

410

1014

1415

1521

2123

2329

2935

Cloud 4

Table 5 Task mapping in


MEMAX

Tasks

A C D B G F H C E K I J E L M P N O

Clouds 3 4 3 2 1 4 2 4 2 1 2 3 2 2 4

4 2 1

task I preempts the task E from cloud 2 at 16 units of time as task I is an AR task.
The cloud makespan is 35 units of time. The average cloud execution time is 31.25
units of time which is calculated by averaging makespan of each cloud [i.e., (35 + 32
+ 29 + 29) / 4]. The average cloud utilization is 0.89 which is calculated by dividing
average cloud execution time by makespan [i.e., (31.25 / 35)]. The corresponding taskcloud mapping is shown in Table 5. The main merit of MEMAX is that it balances the
trade-off between makespan and average cloud utilization.
Remark 4.2 Note that MinMin (or MaxMin) algorithm underperforms if the dataset
is positively skewed (or negatively skewed). We say that the dataset is positively
skewed if the mass of the distribution is concentrated on the left. For example, the
dataset containing the completion times 1, 2, 3, 4 and 100 is positively skewed. On the
contrary, the dataset is negatively skewed if the mass of the distribution is concentrated
on the right. For example, the completion times 1, 101, 102, 103 and 104 are negatively
skewed. Note that these completion times are outcome of the first phase of MinMin
(or MaxMin) algorithm. The primary reason behind is that both MinMin and Max
Min select minimum completion time in the first phase and in the second phase, while
MinMin selects minimum execution time, MaxMin selects maximum execution
time. As a result, while MinMin algorithm gives poor average cloud utilization for
the positively skewed execution time tasks, MaxMin produces poor makespan for
negatively skewed execution time tasks. The proposed MEMAX algorithm balances
between makespan and average cloud utilization and tries to achieve trade-off between
them.
4.2.1 Pseudo code for MEMAX
For the pseudo code of MEMAX, we use the same terminologies as in MCC. However,
we use some extra terminologies defined as follows.

123

1516

S. K. Panda, P. K. Jana
Notation

Definition

MCXX (ij, k)

Minimum completion for AR or BE task j of application i on cloud k

ETCAR (ij, k)

Expected time to compute AR task j of application i on cloud k

ETCBE (ij,k)

Expected time to compute BE task j of application i on cloud k

mid

Finds median of the completion times

max

Finds maximum of all the median values

The pseudo code for MEMAX is shown in Fig. 3. Like MCC, this algorithm uses
four queues: Q, Q A R , Q B E and Q T . However, this algorithm selects the tasks in any
order instead of FCFS order. The reason is that it select a best task among all. Therefore,
it seems like a batch of task executed in any order. The Procedure 3 and Procedure 4
are called to execute AR or BE tasks, placed in ETCAR or ETCBE respectively (Line
35). The cloud provider serves the AR task first. Then, it calculates earliest start time
followed by the completion time of the tasks (Line 28 of Procedure 3) followed by
the median of the execution time of individual task (Line 911). We use the following
formula to calculate the median of execution times of the tasks over the clouds.


MC X X S i, m+1
if m = odd
2
(2)
median =
m
m
MC X X S (i, 2 )+MC X X S (i, 2 +1) Otherwise
2
It selects the maximum of all medians (Line 1215) and stores its index i.e., temp.
Next, it finds the cloud that takes minimum completion time (Line 16). The Procedure
1 of MCC is called to preempt the BE tasks before the task is assigned to the cloud
(Line 17). If BE tasks are scheduled in the requested slot, then it will be preempted
and placed in Q B E (Line 15 of Procedure 1). Then, it assigns the task to minimum
completion cloud. At last, Procedure 2 of MCC is called to add the successor(s) of the
executed task (Line 110 of Procedure 2) into Q and Q A R . In the similar fashion, it
executes the BE tasks (Line 122 of Procedure 4). The completed task is removed from
the queues i.e., Q, Q X X and cloud ready time as well as l1 or l2 values are updated
(Line 2122 of Procedure 3 and Line 1920 of Procedure 4). The Procedure 3 (or
Procedure 4) while loop iterates until the l1 (or l2 ) is zero (Line 124 of Procedure 3
and Line 122 of Procedure 4) and main algorithm MEMAX while loop iterates until
the Q is empty (Line 17).
Theorem 4.2 The overall time complexity of the algorithm MEMAX is O(kl 2 m).
Proof Steps 28 of the Procedure 3 and Steps 27 of the Procedure 4 require O(l1 m 1 )
and O(l2 m 2 ) time. However, Step 7 of Procedure 3 and Step 6 of Procedure 4 require
O(m 1 logm 1 ) and O(m 2 logm 2 ) time. Steps 911 of the Procedure 3 and Steps 810 of
the Procedure 4 also require O(l1 m 1 ) and O(l2 m 2 ) time. Steps 1215 of the Procedure
3 and Steps 11 to 14 of the Procedure 4 require O(l1 ) and O(l2 ) time and Step 16
of the Procedure 3 and Step 15 of the Procedure 4 require O(m 1 ) and O(m 2 ) time.
Steps 1719 of the Procedure 3 and Steps 1617 of the Procedure 4 require O(1) time.
Step 20 of Procedure 3 and Step 18 of Procedure 4 requires O(|Q T |) O(l)
time assuming l = max(l1 , l2 ). Steps 2122 of the Procedure 3 and Steps 1920

123

Efficient task scheduling algorithms

1517

Algorithm: MEMAX

1. while Q NULL do
2.
if QAR NULL
3.
Call SCHEDULE-TASKS-MEMAX(ETCAR, l1, m1)
4.
else
5.
Call SCHEDULE-TASKS-MEMAX(ETCBE, l2, m2)
6.
endif
7. endwhile
Procedure 3: SCHEDULE-TASKS-MEMAX(ETCAR, l1, m1)

1. while l1 0 do
2.
for i = 1, 2,, l1
3.
Find EST(i)
4.
for k = 1, 2,, m1
5.
MCAR(i, k) = ETCAR(i, k) + EST(i)
6.
endfor
7.
Sort the row i of MCAR in ascending order and place in MCARS
8.
endfor
9.
for i = 1, 2,, l1
10.
mediani = mid ( MCARS (i, k )) , 1 k m1
k

11.
12.
13.

endfor
for i = 1, 2,, l1
maximum = max(mediani )

14.
temp = i
15.
endfor
16.
Find the cloud l that gives minimum MCAR(temp, l)
17.
Call PREEMPT-BE-TASKS(EST(temp), MCAR(temp, l), l)
18.
Assign temp to cloud l
19.
CRT(l) = CRT(l) + ETCAR(temp, l)
20.
Call ADD-SUCCESSORS(temp)
21.
Remove temp from Q and QAR
22.
l1 = l1 - 1
23. endwhile
24. Return
Fig. 3 Pseudo code for MEMAX algorithm

of Procedure 4 requireO(1) time. As Step 1 iterates l1 times for Procedure 3 and


l2 times for Procedure 4, it requires O(l12 m1 ) andO(l22 m 2 ), respectively. As the
main algorithm MEMAX invokes SCHEDULE-TASKS-MEMAX k times i.e., |Q|, the
overall time complexity of MEMAX is O(kl2 m) assuming l = max(l1 , l2 ) and m =
max(m 1 , m 2 ) which is same as cloud MinMin scheduling (CMMS) [16].
4.3 Cloud MinMax normalization scheduling
Cloud MinMax Normalization (CMMN) is also a two-phase scheduling. The basic
objective of this algorithm is to improve makespan. We observe that the given ETC
matrix may contain some inconsistent element that may have impact on the makespan.
By inconsistent an element we mean an odd element in the ETC matrix which is far

123

1518

S. K. Panda, P. K. Jana
Procedure 4: SCHEDULE-TASKS-MEMAX(ETCBE, l2, m2)

1. while l2 0 do
2.
for i = 1, 2,, l2
3.
for k = 1, 2,, m2
4.
MCBE(i, k) = ETCBE(i, k) + CRT(k)
5.
endfor
6.
Sort the row i of MCBE in ascending order and place in MCBES
7.
endfor
8.
for i = 1, 2,, l2
9.
mediani = mid ( MCBES (i, k )) , 1 k m2
k

10.
11.

endfor
for i = 1, 2,, l2

12.

maximum = max(mediani )

13.
temp = i
14. endfor
15. Find the cloud l that gives minimum MCBE(temp, l)
16. Assign temp to cloud l
17. CRT(l) = CRT(l) + ETCBE(temp, l)
18. Call ADD-SUCCESSORS(temp)
19. Remove temp from Q and QBE
20. l2 = l2 - 1
21. endwhile
22. Return
Fig. 3 continued

away from other elements. For example, in the set of execution time {1, 2, 3, 101},
101 is an inconsistent element. Therefore, we normalize the ETC matrix as follows.
The proposed CMMN algorithm normalizes the dataset (ETC matrix) into a range
from the maximum value to the minimum one in the dataset. For example, maximum
and minimum execution times of the ETC matrix shown in Table 1 are 19 and 2 units
of time respectively when all the tasks are arrived at the same time. Otherwise, the
maximum and minimum execution times are calculated for each group of tasks that are
ready at a time. In the worst case, each group contains only one task. The normalized
ETC matrix is given in Eq. 3 which is built by considering this worst case as follows.

N E T Ci j,k

123

T11

T12
A1
..

T1 p1
..
=
.
Tn1

Tn2
An
..

Tnpn

C1

N E T C11,1

N E T C12,1

..

N
E
T
C1 p1 ,1

..
.

N E T Cn1,1

N E T Cn2,1

..

N E T Cnpn ,1

C2
N E T C11,2
N E T C12,2
..
.

..
.

Cm
N E T C11,m
N E T C12,m
..
.

N E T C1 p1 ,2 N E T C1 p1 ,m
..
..
..
.
.
.
N E T Cn1,2 N E T Cn1,m
N E T Cn2,2 N E T Cn2,m
..
..
..
.
.
.
N E T Cnpn ,2 N E T Cnpn ,m

(3)

Efficient task scheduling algorithms

1519

Table 6 The scheduling sequence in CMMN


Time

010

1013

1315

1521

2132

3235

Cloud 1

Time

04

46

614

1416

1618

1821

2129

2932

3235

Cloud 2

Time

04

46

616

1629

2935

Cloud 3

Time

04

410

1014

1415

1521

2123

2329

2935

Cloud 4

The normalized ETC matrix element NETCi j ,k is formed by taking the ratio between the
difference of ETCi j ,k and the minimum execution time of task Ti j and the difference of
maximum and minimum execution time of task Ti j i.e.,

N E T Cij,k =

ETC ij,k Minimumij


, 1 i n, 1 j pn , 1 k m
Maximumij Minimumij

(4)

In the proposed CMMN algorithm, we categorize the normalized dataset into two batches
i.e., small-batch and large-batch. The categorization depends on a threshold value 1 .
Finding the threshold values is a trade-off. After formation of the batches, we apply
MinMin algorithm to large batch followed by small batch because this algorithm is
a benchmark for two-phase scheduling [8]. Therefore, the large tasks are scheduled
before the small tasks. It overcomes the load imbalance problem as inconsistent tasks are
assigned first instead of consistent tasks. Given the DAGs of Fig. 1 and the ETC matrix
in Table 1 (refer Sect. 3), the corresponding Gantt chart for four clouds is produced in
Table 6 which is same as MEMAX.
The method is illustrated as follows. The entry task A has minimum and maximum
execution time of 4 and 9 units of time respectively. Therefore, the execution time of task
A is normalized to 1, 0.2, 0 and 0.8 respectively. Let us assume that the threshold value as
0.5. So, task Ais kept in large-batch as maximum normalized value of task Ai.e., 1 > 0.5
and A is assigned to cloud 3 as the minimum completion time is achieved in that cloud.
Next, the successors of task A that is task B, task C and task D are ready for scheduling.
Maximum normalized value of these tasks is 0.875, 1 and 1 respectively. Therefore, they
are kept in large-batch and scheduled in the order of B, D and C, respectively. Again,
consider a scenario when task I and task J are ready for execution. Here, task I is
assigned to small-batch whereas task J is assigned to large-batch as the minimum and
maximum execution time are 2 and 19, respectively. The remaining tasks are assigned
similarly. Note that task F preempts the task C from cloud 4 at 10 units of time as task
F is an AR task. Similarly, task I preempts the task E from cloud 2 at 16 units of time as
task I is an AR task. The cloud makespan is 35 units of time. The average cloud execution
time and average cloud execution time is 31.25 units of time and 0.89, respectively. The
corresponding task-cloud mapping is shown in Table 7.

123

1520

S. K. Panda, P. K. Jana

Table 7 Task mapping in


CMMN

Tasks

A B D C G F H C E K I J E L M P N O

Clouds 3 2 3 4 1 4 2 4 2 1 2 3 2 2 4

4 2 1

4.3.1 Pseudo code for CMMN


The extra terminologies used in the pseudo code of CMMN are defined as follows.
Notation

Definition

NETCAR(ij, k)

Normalized matrix for AR task j of application i on cloud k

NETCBE(ij, k)

Normalized matrix for BE task j of application i on cloud k

Algorithm: CMMN

1. while Q NULL do
2.
if QAR NULL
3.
Call SCHEDULE-TASKS-CMMN(ETCAR, l1, m1)
4.
else
5.
Call SCHEDULE-TASKS-CMMN(ETCBE, l2, m2)
6.
endif
7. endwhile
Procedure 5: SCHEDULE-TASKS-CMMN(ETCAR, l1, m1)

1. Find key1i = min( ETCAR (i, k )) , 1

2. Find Minimum = min(key1i ) , 1


i

i
k

l1

m1 , 1

l1

l1

3. Find key2i = max( ETCAR(i, k )) , 1


4. Find Maximum = max(key2i ) , 1

m1 , 1

l1

5. for i = 1, 2,, l1
6.
for k = 1, 2,, m1
7.

NETCAR(i, k) =

ETCAR (i, k) - Minimum


Maximum - Minimum

8. endfor
9. endfor
10. Set k1 = 1 and k2 = 1
11. for i = 1, 2,, l1
12. Find maxkey = max( NETCAR (i, k )) , 1
k

m1

13. if maxkey 1
14.
small-batch(k1) = i
15.
k1 = k1 + 1
16. else
17.
large-batch(k2) = i
18.
k2 = k2 + 1
19. endif
20. endfor
21. Apply Min-Min algorithm to large-batch and small-batch respectively
22. Return
Fig. 4 Pseudo code for CMMN algorithm

123

Efficient task scheduling algorithms

1521

Procedure 6: SCHEDULE-TASKS-CMMN(ETCBE, l2, m2)

1. Find key1i = min( ETCBE (i, k )) , 1

2. Find Minimum = min(key1i ) , 1


i

l2

m2 , 1

l2

l2

3. Find key2i = max( ETCBE (i, k )) , 1


4. Find Maximum = max(key2i ) , 1

m2 , 1

l2

5. for i = 1, 2,, l2
6.
for k = 1, 2,, m2
7.

NETCBE(i, k) =

ETCBE (i, k) - Minimum


Maximum - Minimum

8. endfor
9. endfor
10. Set k1 = 1 and k2 = 1
11. for i = 1, 2,, l2
12. Find maxkey = max( NETCBE (i, k )) , 1
k

m2

13. if maxkey 1
14.
small-batch(k1) = i
15.
k1 = k1 + 1
16. else
17.
large-batch(k2) = i
18.
k2 = k2 + 1
19. endif
20. endfor
21. Apply Min-Min algorithm to large-batch and small-batch respectively
22. Return
Fig. 4 continued

The pseudo code for CMMN is shown in Fig. 4. Like MCC and MEMAX, this
algorithm uses four queues. But unlike MCC, this algorithm executes the request in
any order instead of FCFS order. First, this algorithm finds the maximum and minimum
value in the ETC matrix. The reason is to normalize the dataset into a specified range
(Line 14 of Procedure 5 and Procedure 6). Then it normalizes each element using Min
Max normalization (Line 59) and it classifies the task into small-batch and large-batch,
respectively. For this, it uses a threshold values i.e., 1 (Line 1120). Determining the
optimal value of these thresholds is a trade-off. However, we are getting optimal value
in the range of 0.2 to 0.5, respectively. Finally, it applies MinMin algorithm to execute
large-batch and small-batch, respectively (Line 21). The main Procedure CMMN while
loop iterates until the queue Q is empty (Line 17). Note that, Procedure 1 and 2 are
used by MinMin algorithm to schedule the tasks. However, we have not shown here for
simplicity.
Theorem 4.3 The overall time complexity of the algorithm CMMN is O(kl2 m).

123

1522

S. K. Panda, P. K. Jana

Proof To find the maximum and minimum element in the ETCAR and ETCBE matrix,
it requires O(l1 m 1 ) and O(l2 m 2 ) time (Line 14 of the Procedure 5 and Procedure 6).
Again, O(l1 m 1 ) and O(l2 m 2 ) time are required to normalize the ETC matrix (Line 5 to
9). Step 10 of Procedure 5 and Procedure 6 requires O(1) time. To place the tasks into
one of the batch requires O(l1 m 1 ) and O(l2 m 2 ) time in Procedure 5 and 6, respectively
(Line 11 to 20). But MinMin algorithm takes O(l12 m 1 ) and O(l22 m 2 ) time in Procedure
5 and 6 (Line 21) [812,30]. So, the overall time complexity of CMMN is O(kl2 m) time
as CMMN invokes SCHEDULE-TASKS-CMMN k times assuming l = max(l1 , l2 ) and m
= max(m 1 , m 2 ). This is same as CMMS [16].

5 Performance metrics
For the evaluation of the performance of the proposed algorithms, we use two parameters,
viz., cloud makespan and average cloud utilization which are briefly described as follows.
Note that makespan and average cloud utilization are usually used in grid computing;
but we extend them to Cloud Computing scenario.
A. Makespan (M)
The cloud makespan (M) is the overall completion time needed to execute all the tasks
by the available clouds [30]. Let M(Ck ) be the makespan of cloud k, 1 k m,
|Ai | be the total number of tasks of the application Ai , 1 i n, ST(ij) be the start
time of task j of application i, CTL(k) be the completion time of last task on cloud k,
We assume that CTL(k) = 0 initially. F(ij, k) be a Boolean variable defined as follows

F(i j, k) =

1
0

if Ti j is assigned to Ck
Otherwise

(5)

Then the individual makespan of each cloud Ck can be mathematically expressed as


follows.
M(Ck ) =

|Ai |
n

E T C(i j, k) F(i j, k) + (ST (i j) C T L(k)) F(i j, k)

(6)

i=1 j=1

Therefore the overall makespan over all the clouds is


M = max(M(Ck )), 1 k m

(7)

It is noteworthy that for an algorithm to be optimum, the overall makespan should be


least.
B. Average cloud utilization(U )
Cloud utilization is the ratio of makespan of the cloud and the overall makespan M.
The average cloud utilization (U ) is then defined as the average utilization of all the
clouds. It ranges from 0 to 1.
Mathematically,
m
U (Ci )
U = i=1
(8)
m

123

Efficient task scheduling algorithms

1523

where U (Ci ) denotes the utilization of cloud Ci . Note that a load balanced schedule
should have maximum average cloud utilization [30].

6 Experimental results
We evaluate the proposed algorithms through simulation run with benchmark and synthetic datasets. The experiments were carried out using MATLAB R2012a version
7.14.0.739 on an Intel Core 2 Duo processor, 2.20 GHz CPU and 4 GB RAM running
on Microsoft Windows 7 platform.

6.1 Simulation results on benchmark dataset


In this simulation, we considered two different benchmark datasets generated by Braun et
al. [8,31]. Each dataset has 12 instances. The first dataset contains 512 tasks to be scheduled to 16 different clouds and we denote it by 512 16. The second dataset contains 1024
tasks to be scheduled to 32 different clouds and we denote it by 1024 32. The general
structure of the dataset instances is expressed in the form of u_x_yyzz. Here, u denotes
that uniform distribution to generate the instances, x denotes the type of consistency [i.e.,
consistent (c), inconsistent (i) or semi-consistent (s)], yy shows the task heterogeneity
[i.e., high (hi) or low (lo)] and zz shows the machine heterogeneity [i.e., high (hi) or
low (lo)]. Therefore, this structure makes 12 different instances (i.e., u_c_hihi, u_c_hilo,
u_c_lohi, u_c_lolo, u_i_hihi, u_i_hilo, u_i_lohi, u_i_lolo, u_s_hihi, u_s_hilo, u_s_lohi
and u_s_lolo). This is important to note that these instances of the datasets are used in
task scheduling as proposed in [29,30]. We incorporated both AR or BE application tasks
by assuming first 50 % tasks were AR and the next 50 % tasks were BE. However, for
the sake of simplicity of simulation run, we assumed zero arrival time for all the tasks to
avoid preemption of any task.
The makespan of the proposed algorithm MCC is calculated for both 512 16 and
1,024 32 datasets and compared with that of RR and CLS scheduling as shown in Table
8. For the sake of easy visualization, the graphical comparisons of makespan for 512
16 and 1024 32 dataset are also separately shown in Figs. 5 and 6, respectively. The
comparison results clearly show that the proposed algorithm gives better makespan than
the RR and CLS scheduling for all the 12 instances of both the datasets. The rationale
behind it is that the MCC assigns the tasks to the available clouds as per the minimum
completion time. Therefore, the ready time of each cloud is calculated after each task
assigned and thus it results least makespan.
The average cloud utilization of RR, CLS and MCC algorithm are also calculated
for both the benchmark datasets which is jointly shown in Table 9. Note that for 8
instances out of 12 (about 67 %) of both the datasets, the average cloud utilization of
the proposed algorithm MCC is better than that of the RR and CLS scheduling. The
rationale behind is that the MCC distributes the tasks to the available cloud based on the
minimum completion time. Hence the makespan over all the clouds available increases
with respect to the ready time.
We next ran the proposed algorithm MEMAX to calculate the makespan and the
average cloud utilization. The results are compared with CMAXMS (i.e, MaxMin [10])

123

123

3.4409e+07

u_i_hihi

4.4659e+05

1.1974e+06

1.5397e+04

u_s_lohi

u_s_lolo

u_s_hihi

u_s_hilo

1.1034e+04

4.1530e+07

u_i_lolo

3.0365e+05

1.8132e+04

u_c_lolo

1.0015e+06

1.5174e+06

u_c_lohi

u_i_lohi

4.8937e+05

u_i_hilo

4.6589e+07

u_c_hilo

RR (512 16)

Cloud makespan

u_c_hihi

Instance

2.1042e+04

6.7469e+05

6.0536e+05

2.5162e+07

3.3993e+03

1.8569e+05

9.6610e+04

4.5085e+06

3.9582e+04

1.4531e+06

1.1851e+06

4.7472e+07

CLS (512 16)

4.4361e+03

1.8615e+05

1.2659e+05

6.6939e+06

3.1374e+03

1.4382e+05

9.4856e+04

4.4136e+06

6.3601e+03

3.7830e+05

1.8589e+05

1.1423e+07

MCC (512 16)

Table 8 Comparison of cloud makespan for RR, CLS and MCC algorithm in benchmark dataset

1.5880e+03

1.5227e+04

1.6722e+07

1.5160e+08

1.0733e+03

9.9151e+03

9.6486e+06

1.0113e+08

1.4582e+03

1.9752e+04

1.6570e+07

1.6070e+08

RR (1024 32)

8.0161e+02

8.3377e+03

8.0988e+06

8.4821e+07

9.1120e+01

8.5439e+02

7.6598e+05

7.4620e+06

1.5675e+03

1.4151e+04

1.5504e+07

1.5447e+08

CLS (1024 32)

1.9423e+02

1.8220e+03

1.8255e+06

1.9008e+07

7.2390e+01

7.5410e+02

7.1313e+05

7.5671e+06

3.2628e+02

3.0587e+03

3.2458e+06

3.2833e+07

MCC (1024 32)

1524
S. K. Panda, P. K. Jana

Efficient task scheduling algorithms

10
10

Cloud

Makespan

10
10
10
10
10
10
10

1525

RR

CLS

MCC

u_c_hihi u_c_hilo u_c_lohi u_c_lolo u_i_hihi u_i_hilo u_i_lohi u_i_lolo u_s_hihi u_s_hilo u_s_lohi u_s_lolo
Instances

Fig. 5 Graphical comparison of makespan for RR, CLS and MCC using 512 16 benchmark dataset
9

10

10

Makespan
Cloud

10

10

10

10

10

10

10

RR

CLS

MCC

u_c_hihiu_c_hilou_c_lohiu_c_lolou_i_hihiu_i_hilou_i_lohiu_i_lolou_s_hihiu_s_hilou_s_lohiu_s_lolo
Instances

Fig. 6 Graphical comparison of makespan for RR, CLS and MCC using 1024 32 benchmark dataset

algorithm as shown in Table 10 for both the 512 16 and 1024 32 benchmark datasets.
This is also shown using bar charts in Fig. 7. The comparison results clearly show that 12
out of 12 instances (i.e., 100 %) in both the dataset give better makespan for the proposed
algorithm MEMAX than the CMAXMS algorithm. The result for MEMAX also shows
that the cloud utilization value is much closer to that of the CMAXMS and even better
for some instances.

123

123

0.57

0.58

0.58

0.61

u_s_hilo

u_s_lohi

u_s_lolo

u_i_lolo

u_s_hihi

0.82

0.81

u_i_lohi

0.69

0.84

u_i_hilo

u_c_lolo

u_i_hihi

0.53

0.50

u_c_lohi

0.51

0.54

u_c_hilo

RR (512 16)

Average cloud utilization

u_c_hihi

Instance

0.22

0.21

0.21

0.19

0.74

0.53

0.75

0.62

CLS (512 16)

0.95

0.95

0.93

0.92

0.96

0.94

0.95

0.93

0.95

0.96

0.97

0.95

MCC (512 16)

0.50

0.50

0.47

0.52

0.72

0.80

0.78

0.81

0.54

0.40

0.48

0.50

RR (1024 32)

Table 9 Comparison of average cloud utilization for RR, CLS and MCC algorithm in benchmark dataset

0.06

0.06

0.06

0.05

0.52

0.57

0.60

0.66

CLS (1024 32)

0.90

0.89

0.93

0.91

0.91

0.91

0.91

0.91

0.95

0.92

0.94

0.93

MCC (1024 32)

1526
S. K. Panda, P. K. Jana

4.9162e+03

8.3669e+06

1.6734e+05

2.5837e+05

6.0724e+03

u_s_hihi

u_s_hilo

u_s_lohi

u_s_lolo

u_i_lohi

u_i_lolo

1.4651e+05

2.4031e+05

u_i_hilo

6.8742e+03

6.6097e+06

u_i_hihi

u_c_lohi

u_c_lolo

2.0117e+05

4.0824e+05

u_c_hilo

1.2225e+07

4.0019e+03

1.6484e+05

1.1254e+05

5.9452e+06

3.0180e+03

1.3350e+05

8.8422e+04

3.8713e+06

5.8148e+03

3.3023e+05

1.7455e+05

9.5814e+06

2.2141e+02

2.1427e+03

2.0914e+06

2.2572e+07

1.1777e+02

1.1700e+03

1.1827e+06

1.2103e+07

3.2460e+02

3.0955e+03

3.2470e+06

3.2791e+07

1.6663e+02

1.6399e+03

1.6112e+06

1.7306e+07

6.6890e+01

7.0034e+02

6.7826e+05

6.7276e+06

2.7762e+02

2.6298e+03

2.6777e+06

2.6956e+07

MEMAX
(1024 32)

0.99

0.99

0.99

0.99

0.99

0.99

0.99

0.99

0.99

0.99

0.99

0.99

0.99

0.99

0.99

0.99

0.99

0.99

0.99

0.98

0.99

0.99

0.99

0.99

MEMAX
(512 16)

CMAXMS
(512 16)

CMAXMS
(1024 32)

CMAXMS
(512 16)

MEMAX (512
16)

Average cloud utilization

Cloud makespan

u_c_hihi

Instance

Table 10 Comparison of cloud makespan and average cloud utilization for CMAXMS and MEMAX algorithm in benchmark dataset

0.99

0.99

0.99

0.99

0.99

0.99

0.99

0.98

0.99

0.99

0.99

0.99

CMAXMS
(1024 32)

0.91

0.93

0.85

0.89

0.89

0.84

0.80

0.83

0.97

0.94

0.96

0.97

MEMAX
(1024 32)

Efficient task scheduling algorithms


1527

123

1528
10

10

Cloud

Makespan

10

10

10

10

10

10

10

S. K. Panda, P. K. Jana
8

CMAXMS (51216)

u_c_hihi

u_c_hilo

u_c_lohi

u_c_lolo

MEMAX (51216)
u_i_hihi

u_i_hilo u_i_lohi
Instanc es

CMAXMS (102432)
u_i_lolo

u_s_hihi

u_s_hilo

MEMAX (102432)
u_s_lohi

u_s_lolo

Fig. 7 Graphical comparison of makespan for CMAXMS and MEMAX using both 512 16 and 1024
32 benchmark dataset

Similarly the makespan and average cloud utilization of the proposed algorithm
CMMN are compared with that of CMMS (i.e., MinMin [10]) and shown in tabular
and graphical forms in Table 11 and Fig. 8, respectively. In this case also, the proposed
CMMN algorithm outperforms the CMMS for both the benchmark datasets.

6.2 Simulation with synthetic dataset


In this simulation, we took a dataset consisting of ten instances. These instances are
generated by MATLAB using uniform distributed pseudorandom integer function randi().
It returns a two-dimensional array of size [l, m] containing integers on the given interval
[imin, imax]. For each instance, the first value denotes the number of tasks (l) and second
value denotes the number of clouds (m) and we denote it by l m. Here also, we
incorporated the first 50 % tasks as AR and the next 50 % tasks in BE and assumed
zero arrival time for all the tasks. The comparison of both makespan and average cloud
utilization of RR, CLS with MCC are jointly shown in Table 12. The comparison results
clearly show that the proposed algorithm MCC gives better makespan and average cloud
utilization than the RR and CLS for all instances of the dataset.
Makespan and average cloud utilization of CMAXMS are also compared with the
proposed MEMAX which are shown in Table 13. It is obvious to see that for all instances,
MEMAX results better makespan than CMAXMS. However, average cloud utilization
value is much closer to that of the CMAXMS.
We also compare makespan and average cloud utilization of the proposed algorithm
CMMN with that of CMMS as shown in Table 14 which clearly demonstrate that the
proposed CMMN performs better than the CMMS.

123

2.7496e+03

5.4016e+06

1.0413e+05

1.5230e+05

3.9246e+03

u_s_hihi

u_s_hilo

u_s_lohi

u_s_lolo

u_i_lohi

u_i_lolo

8.2294e+04

1.2324e+05

u_i_hilo

5.5095e+03

3.4773e+06

u_i_hihi

u_c_lohi

u_c_lolo

1.6383e+05

2.8832e+05

u_c_hilo

8.9960e+06

3.6772e+03

1.3894e+05

1.0118e+05

4.9277e+06

2.6787e+03

1.1431e+05

7.7264e+04

3.3195e+06

5.3703e+03

2.7438e+05

1.5981e+05

8.4291e+06

1.4663e+02

1.4373e+03

1.3961e+06

1.4317e+07

6.0490e+01

6.8456e+02

6.5330e+05

6.2904e+06

2.2688e+02

2.1990e+03

2.2742e+06

2.2290e+07

1.3885e+02

1.3530e+03

1.3686e+06

1.3857e+07

5.5760e+01

5.8110e+02

5.6588e+05

6.0217e+06

2.2053e+02

2.1118e+03

2.2133e+06

2.1877e+07

CMMN (1024
32)

0.90

0.81

0.93

0.82

0.94

0.87

0.91

0.89

0.95

0.88

0.95

0.87

0.97

0.94

0.98

0.92

0.98

0.96

0.98

0.95

0.98

0.96

0.98

0.97

CMMN (512
16)

CMMS (512
16)

CMMS (1024
32)

CMMS (512
16)

CMMN (512
16)

Average cloud utilization

Cloud makespan

u_c_hihi

Instance

Table 11 Comparison of cloud makespan and average cloud utilization for CMMS and CMMN algorithm in benchmark dataset

0.79

0.78

0.79

0.81

0.84

0.78

0.76

0.83

0.89

0.85

0.88

0.89

CMMS (1024
32)

0.92

0.93

0.92

0.93

0.94

0.94

0.94

0.94

0.97

0.95

0.96

0.97

CMMN (1024
32)

Efficient task scheduling algorithms


1529

123

1530
10

10

Cloud

Makespan

10

10

10

10

10

10

10

S. K. Panda, P. K. Jana
8

CMMS (51216)

u_c_hihi

u_c_hilo

u_c_lohi

u_c_lolo

u_i_hihi

CMMN (51216)
u_i_hilo
u_i_lohi
Instanc es

CMMS (102432)

u_i_lolo

u_s_hihi

u_s_hilo

CMMN (102432)
u_s_lohi

u_s_lolo

Fig. 8 Graphical comparison of makespan for CMMS and CMMN using both 512 16 and 1024 32
benchmark dataset
Table 12 Comparison of cloud
makespan and average cloud
utilization for RR, CLS and
MCC algorithm in synthetic
dataset

Instance

Cloud makespan

Average cloud utilization

RR

CLS

MCC

RR

CLS

MCC

100 4

26,878

28,071

22,445

0.70

0.65

0.84

200 8

26,878

30,601

21,754

0.66

0.55

0.80

300 12

22,195

32,901

18,899

0.79

0.51

0.91

400 16

26,986

36,508

21,397

0.63

0.44

0.78

500 20

26,953

35,766

21,430

0.63

0.45

0.77

600 24

26,886

42,933

21,339

0.63

0.37

0.77

700 28

26,986

28,108

21,307

0.62

0.57

0.76

800 32

26,792

25,437

21,225

0.62

0.62

0.76

900 36

26,849

31,393

21,179

0.62

0.50

0.76

1,000 40 26,787

36,533

21,231

0.62

0.43

0.76

7 Conclusion
We have presented three task scheduling algorithms namely, MCC, MEMAX and CMMN
for heterogeneous multi-cloud systems. The MCC is a single-phase scheduling which
has been shown to run in O(lm) time for l tasks and m clouds. The other two algorithms
are two-phase scheduling that are shown to require O(kl2 m) for k iterations. We have
shown the experimental results on two benchmark datasets and one synthetic dataset
and compared the results with as per their applicability. Comparison results have shown
that the proposed algorithms outperform than four existing multi-cloud task scheduling

123

Efficient task scheduling algorithms


Table 13 Comparison of cloud
makespan and average cloud
utilization for CMAXMS and
MEMAX algorithm in synthetic
dataset

Table 14 Comparison of cloud


makespan and average cloud
utilization for CMMS and
CMMN algorithm in synthetic
dataset

1531
Instance

Cloud makespan

Average cloud utilization

CMAXMS

MEMAX

CMAXMS

MEMAX

100 4

19,909

19534

0.96

0.95

200 8

19,442

19,117

0.92

0.91

300 12

19,256

18,921

0.92

0.90

400 16

19,105

18,833

0.89

0.88

500 20

19,030

18,800

0.89

0.87

600 24

19,005

18,773

0.88

0.87

700 28

18,997

18,686

0.88

0.87

800 32

18,907

18,699

0.88

0.86

900 36

18,899

18,665

0.87

0.86

1,000 40 18,859

18,669

0.87

0.86

Instance

100 4

Cloud makespan

Average cloud utilization

CMMS

CMMN

CMMS

CMMN

22,107

19,411

0.84

0.95

200 8

21,369

17,432

0.80

0.99

300 12

21,483

18,656

0.78

0.90

400 16

21,170

18,625

0.77

0.88

500 20

21,102

18,584

0.77

0.87

600 24

21,052

18,578

0.76

0.86

700 28

21,025

18,513

0.76

0.86

800 32

20,913

18,429

0.76

0.86

900 36

20,908

18,362

0.76

0.87

1,000 40

20,874

18,363

0.76

0.86

algorithms namely, RR, CLS, CMMS, CMAXMS in terms of makespan and average
cloud utilization.

References
1. Buyya R, Yeo CS, Venugopal S, Broberg J, Brandic I (2009) Cloud computing and emerging IT
platforms: vision, hype and reality for delivering computing as the 5th utility. Future Gen Comput Syst
Elsevier 25:599616
2. Durao F, Carvalho JFS, Fonseka A, Garcia VC (2014) A systematic review on cloud computing. J
Supercomput 68(3):13211346
3. Rimal BP, Choi E, Lumb I (2009) A taxonomy and survey of cloud computing systems. Fifth international joint conference on INC, IMS and IDC, pp 4451
4. Tsai J, Fang J, Chou J (2013) Optimized task scheduling and resource allocation on cloud computing
environment using improved differential evolution algorithm. Comput Oper Res Elsevier 40(12):3045
3055
5. Armbrust M, Fox A, Griffith R, Joseph AD, Katz RH, Konwinski A, Lee G, Patterson DA, Rabkin A,
Stoica I, Zaharia M (2009) Above the clouds: a berkeley view of cloud computing. Technical report
no. USB/EECS-2009-28. http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-28.html

123

1532

S. K. Panda, P. K. Jana

6. Begnum K (2012) Simplified cloud-oriented virtual machine management with MLN. J Supercomput
61(2):251266
7. Ullman JD (1975) NP-complete scheduling problems. J Comput Syst Sci 10(3):384393
8. Braun TD, Siegel HJ, Beck N, Boloni LL, Maheswaran M, Reuther AI, Robertson JP, Theys MD,
Yao B, Hensgen D, Freund RF (2001) A comparison of eleven static heuristics for mapping a class
of independent tasks onto heterogeneous distributed computing systems. J Parallel Distrib Comput
61(6):810837
9. Maheswaran M, Ali S, Siegel HJ, Hensgen D, Freund RF (1999) Dynamic mapping of a class of
independent tasks onto heterogeneous computing systems. J Parallel Distrib Comput 59:107131
10. Ibarra OH, Kim CE (1977) Heuristic algorithms for scheduling independent tasks on nonidentical
processors. J Assoc Comput Mach 24(2):280289
11. Armstrong R, Hensgen D, Kidd T (1998) The relative performance of various mapping algorithms is
independent of sizable variances in run-time predictions. 7th IEEE heterogeneous computing workshop.
pp 7987
12. Freund RF, Gherrity M, Ambrosius S, Campbell M, Halderman M, Hensgen D, Keith E, Kidd T,
Kussow M, Lima JD, Mirabile F, Moore L, Rust B, Siegel HJ (1998) Scheduling resources in multiuser, heterogeneous, computing environments with smartNet. 7th IEEE heterogeneous computing
workshop. pp 184199
13. Kwok Y, Ahmad I (1996) Dynamic critical-path scheduling: an effective technique for allocating task
graphs to multiprocessors. IEEE Trans Parallel Distrib Syst 7(5):506521
14. Topcuoglu H, Hariri S, Wu M (2002) Performance-effective and low-complexity task scheduling for
heterogeneous computing. IEEE Trans Parallel Distrib Syst 13(3):260274
15. Bajaj R, Agrawal DP (2004) Improving scheduling of tasks in a heterogeneous environment. IEEE
Trans Parallel Distrib Syst 15(2):107118
16. Li J, Qiu M, Ming Z, Quan G, Qin X, Gu Z (2012) Online optimization for scheduling preemptable
tasks on IaaS cloud system. J Parallel Distrib Comput Elsevier 72:666677
17. Li J, Qiu M, Niu JW, Chen Y, Ming Z (2010) Adaptive resource allocation for preemptable jobs in cloud
systems (2010) 10th IEEE international conference on intelligent systems design and applications.
pp 3136
18. Wen H, Hai-ying Z, Chuang L, Yang Y (2011) Effective load balancing for cloud-based multimedia
system (2011) International conference on electronic and mechanical engineering and information
technology, pp 165168
19. Wang S, Yan K, Liao W, Wang S (2010) Towards a load balancing in a three-level cloud computing
network. 3rd IEEE international conference on computer science and information technology. vol 1.
pp 108113
20. Ergu D, Kou G, Peng Y, Shi Y, Shi Y (2013) The analytic hierarchy process: task scheduling and
resource allocation in cloud computing environment. J Supercomput Springer 64:835848
21. Rai A, Bhagwan R, Guha S (2012) Generalized resource allocation for the cloud. 3rd ACM symposium
on cloud computing
22. Sotomayor B, Keahey K, Foster I (2008) Combining batch execution and leasing using virtual machines
(2008) 17th international symposium on high performance distributed computing, ACM pp 8796
23. Sotomayor B, Montero RS, Llorente IM, Foster I (2011) Resource leasing and the art of suspending
virtual machines. 11th IEEE international conference on high performance computing and communications. pp 5968
24. Akhani J, Chuadhary S, Somani G (2011) Negotiation for resource allocation in IaaS cloud. 4th annual
ACM bangalore conference
25. Bozdag D, Ozguner F, Catalyurek U (2009) Compaction of schedules and a two-stage approach for
duplication-based DAG scheduling. IEEE Trans Parallel Distrib Syst 20(6):857871
26. Xu Y, Hu H, Yihe S (2010) Data dependence graph directed scheduling for clustered VLIW architectures. Tsinghua Sci Technol IEEE 15(3):299306
27. Bittencourt LF, Madeira ERM, Fonseca NLSD (2012) Scheduling in hybrid clouds. IEEE Commun
Mag 50(9):4247
28. Nathani A, Chaudhary S, Somani G (2012) Policy based resource allocation in IaaS cloud. Future Gen
Comput Syst Elsevier 28:94103
29. Xhafa F, Carretero J, Barolli L, Durresi A (2007) Immediate mode scheduling in grid systems. Int J
Web Grid Serv 3(2):219236

123

Efficient task scheduling algorithms

1533

30. Xhafa F, Barolli L, Durresi A (2007) Batch mode scheduling in grid systems. Int J Web Grid Serv
3(1):1937
31. Braun FN (2014) Accessed on 9 Jan 2014. https://code.google.com/p/hcsp-chc/source/browse/trunk/
AE/ProblemInstances/HCSP/Braun_et_al/u_c_hihi.0?r=93
32. Kwok Y, Ahmad I (1999) Static scheduling algorithms for allocating directed task graphs to multiprocessors. ACM Comput Surv (CSUR) 31(4):406471
33. Zhang Y, Sivasubramaniam A, Moreira J, Franke H (2001) Impact of workload and system parameters
on next generation cluster scheduling mechanisms. IEEE Trans Parallel Distrib Syst 12(9):967985
34. Hagras T, Janecek J (2005) A high performance, low complexity algorithm for compile-time task
scheduling in heterogeneous systems. Parallel Comput 31(7):653670
35. Lawler EL, Labetoulle J (1978) On preemptive scheduling of unrelated parallel processors by linear
programming. J Assoc Comput Mach 25(4):612619
36. Liu C, Yang S (2011) A heuristic serial schedule algorithm for unrelated parallel machine scheduling
with precedence constraints. J Softw 6(6):11461153
37. Kumar VSA, Marathe MV, Parthasarathy S, Srinivasan A (2009) Scheduling on unrelated machines
under tree-like precedence constraints. Algorithmica 55:205226
38. Lenstra JK, Shmoys DB, Tardos E (1990) Approximation algorithms for scheduling unrelated parallel
machines. Math Program 46(13):259271
39. Leighton FT, Maggs BM, Rao SB (1994) Packet routing and job-shop scheduling in O (Congestion +
Dilation) steps. Combinatorica 14:167186
40. Smith W, Foster I, Taylor V (2000) Scheduling with advanced reservations. 14th international parallel
and distributed processing symposium. pp 127132
41. Haizea (2014) http://haizea.cs.uchicago.edu/whatis.html. Accessed 9 Jan 2014
42. Rimal BP, Choi E, Lumb I (2009) A taxonomy and survey of cloud computing systems. International
joint conference on INC, IMS and IDC. pp 4451
43. Hou E, Ansari N, Ren H (1994) A genetic algorithm for multiprocessor scheduling. IEEE Trans Parallel
Distrib Syst 5(2):113120
44. Yang Q, Peng C, Zhao H, Yu Y, Zhou Y, Wang Z, Du S (2014) A new method based on PSR and
EA-GMDH for host load prediction in cloud computing system. J Supercomput 68(3):14021417
45. Gil J, Park JH, Jeong Y (2013) Data center selection based on neuro-fuzzy inference systems in cloud
computing environments. J Supercomput 66(3):11941214
46. Zhang F, Cao J, Li K, Khan SU, Hwang K (2014) Multi-objective scheduling of many tasks in cloud
platforms. Future Gen Comput Syst Elsevier 37:309320
47. Su S, Li J, Huang Q, Huang X, Shuang K, Wang J (2013) Cost-efficient task scheduling for executing
large programs in the cloud. Parallel Comput 39(45):177188
48. Wang X, Wang Y, Cui Y (2014) A new multi-objective Bi-level programming model for energy and
locality aware multi-job scheduling in cloud computing. Future Gen Comput Syst Elsevier 36:91101

123

You might also like