
J Ambient Intell Human Comput

DOI 10.1007/s12652-014-0220-4

ORIGINAL RESEARCH

A fault-tolerant architecture for ROIA in cloud


Dong Liu

Received: 16 October 2013 / Accepted: 28 December 2013


© Springer-Verlag Berlin Heidelberg 2014

Abstract Real-time online interactive application (ROIA) is an emerging type of large-scale distributed application. To satisfy its high performance demands, ROIA needs a highly robust and efficient architecture that can cope with a huge number of concurrent users. Previous work is mostly based on the Client/Server or Peer-to-Peer mode, whose scalability and resource utilization are relatively low. We therefore take advantage of cloud computing technologies to achieve higher scalability and resource utilization. In cloud computing, however, node failure is no longer considered an accidental event but a normal one, which places higher requirements on fault tolerance in ROIA. In this paper, we propose a new fault-tolerant architecture for ROIA in the cloud, based on a cell overlapping technique. The new architecture provides redundancy to enhance the robustness and scalability of ROIA. We focus on analyzing three merits of the new architecture: seamless migration across zones, server crash protection and dynamic load balancing. This paper sheds some light on related further research on ROIA.
Keywords Cloud · ROIA · MMOG · Overlapping architecture

D. Liu (✉)
Department of Computer Science, JiNan University,
Guangzhou 510632, China
e-mail: wze2k@163.com

1 Introduction
Real-time online interactive application (ROIA) is an emerging type of large-scale distributed application. The popular and market-relevant representatives of ROIA are massively multi-player online games (MMOG), as well as real-time training and e-learning based on high-performance simulation. Compared with traditional distributed applications, ROIA has its own challenging features (Glinka et al. 2010; Gorlatch et al. 2012): short response times to user actions (about 0.1–1.5 s); frequent state computation (up to 50 Hz); and a large and frequently changing number of users in a single application instance (up to 10^4 simultaneously).
To satisfy the high demands on ROIA performance, several key issues must be addressed. Among them, a robust and efficient architecture with suitable scalability is the most important one.
Many researchers have studied this issue and obtained useful results. However, most of this research is based on the classic C/S and P2P modes, and little of it is combined with the emerging cloud computing. Architectures based on the classic C/S and P2P modes cannot simply be copied to a cloud computing environment. For example, in cloud computing the large number of servers increases the overall probability of node failure, so fault tolerance under a high failure rate is essential. The traditional ROIA architectures are not good at dealing with these problems.
In previous work (Liu and Zhao 2013), we proposed the multi-ROIA cloud platform (MRCP), a new approach to achieving high scalability for ROIA in a cloud environment, and focused on the overall frame structure and the corresponding load strategy of MRCP. In this paper, we focus on the architecture of a specific ROIA that can be deployed on the MRCP or in another cloud computing environment.
The rest of the paper is organized as follows. Section 2 describes related work and some techniques that are used in our new architecture. Section 3 proposes a new overlapping architecture for ROIA in the cloud that is robust, efficient and scalable. Section 4 elaborates the merits of the new architecture. Finally, Sect. 5 presents the conclusions of this paper.
In addition, since MMOG is the most important and popular kind of ROIA, we use some MMOG terminology and concepts, which are also applicable to other ROIA.

2 Related work and technology

In this section, we review the merits and drawbacks of the traditional ROIA architectures, and describe zoning and area of interest management (AoIM), which are used in our new architecture.

2.1 Three traditional approaches
ROIA is a large-scale distributed application whose features make it difficult for a single server to meet its performance requirements. Researchers have therefore proposed many solutions, which can be broadly divided into three kinds: the client/multi-server (C/MS) approach, the peer-to-peer (P2P) approach and the cloud approach.

2.1.1 C/MS approach
In the C/MS approach, multiple servers, which provide different functionalities, are used on the server side to meet the high performance requirement. The C/MS approach has some merits, such as:

Authority: the C/MS approach is a centralized approach to the management and storage of the global game state. In this approach, the game provider can completely control its game with authority (Fan 2009).
Security: the important processing logic is deployed on the server side in the C/MS approach, so it is easier to ensure security and prevent cheating (Kabus et al. 2005).
Simple and mature technology: the design and implementation of an MMOG are difficult tasks, but the C/MS approach is relatively easy to implement and deploy (Mulligan and Patrovsky 2003).

These merits lead some developers to think that the C/MS approach is the only approach to MMOG (Alexander 2005). In fact, however, the C/MS approach also has disadvantages, such as a single point of failure, performance bottlenecks and poor scalability. Its biggest drawback is that the infrastructure investment is large but the utilization of this infrastructure is low.

2.1.2 P2P approach
The P2P approach mainly takes advantage of the hardware resources of each peer to achieve good scalability. The merits of the P2P approach are as follows (Jia et al. 2010; Ling et al. 2012; Wang and Wang 2012):

Decrease of cost: in the P2P approach, the user's computer takes part in computing and storage as a peer. For the MMOG provider, the overall cost, which includes hardware, maintenance, network bandwidth and other costs, can be reduced effectively.
Good scalability: as the number of users increases, the computing and storage capacity of the entire system increases too.
Good fault-tolerant capability: each peer can join or quit the system freely. The system structure can be self-organizing, and even the load can be balanced automatically, so the approach has a good fault-tolerant capability.

However, its limitations in authentication, security and consistency make the P2P approach an impractical solution for large-scale ROIA (Ghosh et al. 2010; Miller and Crowcroft 2010).
2.1.3 Cloud approach
The cloud approach is relatively new, and few researchers have proposed solutions. At present the main cloud approach is a technology called Cloud Gaming, whose typical representatives are OnLive (2013) and Gaikai (2013).
In this approach, all the game logic is executed and all the graphics are rendered in the cloud rather than on the client. The cloud delivers the results as a video stream. The user only needs to play the stream, as with YouTube, and needs only a very thin client to send the user's game commands to the cloud.
A major attraction of this approach is that it frees players from the need to frequently upgrade their computers, as they can now play games that are hosted on remote servers. Scalability and consistency are also easy to achieve, because all the game logic runs in the cloud.
However, because what is transferred to the client is a video stream rather than state update information, network bandwidth becomes more important. When a player makes a move, he cannot see the result until the cloud receives the command, processes the move, renders a new screen, and delivers the screen to the client. Moreover, the traditional latency compensation techniques (Bernier 2001), such as dead reckoning (Pantel and Wolf 2002), cannot be used in cloud gaming, because they require game state information that is not available to cloud gaming clients.

Fig. 1 Zoning
Fig. 2 An example of area of interest

Therefore, this approach is more susceptible to response latency than the other approaches. Using electromyographic experiments, Lee et al. (2012) also concluded that not all games are suitable for cloud gaming: the same degree of latency may have a very different impact on a game's quality of experience (QoE).
2.2 Zoning
In the C/MS approach, multiple servers are used on the server side to meet the performance requirement. The C/MS approach must therefore first address how to divide responsibilities among servers, which typically involves dividing users or other world components among different servers. Zoning is one of the most widely used techniques for this.
Zoning (Cai et al. 2002; Assiotis and Tzanov 2006; Chertov and Fahmy 2006; Nae et al. 2011; Bezerra et al. 2012) partitions the whole virtual world into several small zones, based on the principle of data locality. The zones need not have the same shape and size; the world can be partitioned according to the requirements of server performance and load balancing.
For example, in Fig. 1 the game world is partitioned into zones of different sizes (zones A, B, C, D and E), and each zone is supported by a different zone server. Avatars in the same zone perceive each other, whereas avatars in different zones cannot perceive each other directly. A transition between zones can only happen through certain portals (e.g., special doors) and requires a significant amount of time. Static fixed-size partitioning is simple, but it may suffer from flocking (Chen et al. 2005), i.e. a surge in the number of avatars in a zone due to some in-game event. Dynamically changing the size of a zone according to its load [e.g., the approach of Chertov and Fahmy (2006)] can solve this problem. However, dynamic partitioning has to deal with data migration and with the processing of user actions during migration, which greatly increases the complexity.

2.3 Area of interest management


When designing and implementing solutions for ROIA, software architects are faced with three physical resource limitations, which need to be balanced in order to deliver a smooth, coherent online experience (Smed 2002; Carter et al. 2012): network bandwidth, network latency and computational power.
Many different techniques have been devised to alleviate the performance impact of these limitations in ROIA development. AoIM (Boulanger et al. 2006; Albano et al. 2009) is a key technique for compensating for limited network bandwidth.
The principle of AoIM is based on the assumption that, although the players are all participating in the same ROIA at the same time, each player does not need full awareness of every other player's current state within the ROIA. AoIM aims to reduce the number of transmitted messages by identifying the potentially interested receivers and disseminating a player's state only to those receivers that have a current interest in it.
For example, in Fig. 2, the star, triangles and squares represent entities (avatars or other objects) in ROIA. The circular area represents the AoI of the star. The triangles represent the entities that are outside the AoI of the star, and the squares represent the entities that are inside it.
With the concept of AoI, when the state information of entities changes, the server no longer needs to transfer the state updates of all entities to the star; it only needs to transfer the updates of the entities that are in the star's AoI (only the squares in the example).
Furthermore, each avatar has its own AoI, and the server sends different state update information to each avatar according to its AoI. This greatly reduces the amount of information that needs to be transferred and saves a lot of network bandwidth.
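The filtering step described above can be sketched as follows. This is only a minimal illustration, assuming a circular AoI with a fixed radius and two-dimensional positions; the names `Entity`, `entities_in_aoi` and `aoi_radius` are hypothetical and do not come from the original work.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """Hypothetical entity record: an id and a 2D position."""
    entity_id: str
    x: float
    y: float


def entities_in_aoi(observer, entities, aoi_radius):
    """Return the entities whose state updates the observer should receive.

    Assumption: the AoI is a circle of radius `aoi_radius` centred on the
    observer, as in the star/square/triangle example of Fig. 2.
    """
    return [
        e for e in entities
        if e.entity_id != observer.entity_id
        and math.hypot(e.x - observer.x, e.y - observer.y) <= aoi_radius
    ]


star = Entity("star", 0.0, 0.0)
others = [
    Entity("square-1", 3.0, 4.0),      # distance 5, inside the AoI
    Entity("square-2", -2.0, 1.0),     # inside the AoI
    Entity("triangle-1", 30.0, 40.0),  # distance 50, outside the AoI
]
recipients = entities_in_aoi(star, [star] + others, aoi_radius=10.0)
print([e.entity_id for e in recipients])  # ['square-1', 'square-2']
```

A linear scan like this is enough to show the idea; a real server would more likely use a spatial index so that it does not test every entity against every observer.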

Fig. 3 The architecture of MRCP
Fig. 4 The new overlapping architecture

3 A new architecture for ROIA in cloud


In this section, we first briefly introduce the structure of the MRCP and then propose a new overlapping architecture for ROIA in the cloud. Finally, we abstract its topology graph and discuss the problem of cell granularity.
3.1 Multi-ROIA cloud platform
We are motivated by two facts: the number of concurrent users around the world fluctuates widely and periodically, and different ROIA have different requirements for network latency. We therefore proposed the MRCP, a new approach to achieving high scalability for ROIA in a cloud environment. Figure 3 depicts the architecture of MRCP.
MRCP consists of multiple data centers distributed around the world. Each data center has many ROIA servers (RS) and one MRCP local controller (MLC).
The MLC is responsible for user login, storing user information, and load balancing inside the data center. Moreover, each MLC communicates with the other MLCs periodically to obtain the load information of the other data centers. If necessary, the MLC executes load balancing between data centers.
The RS is responsible for the functions of ROIA. Various types of ROIA are deployed on different RS in one data center. In this paper, we focus on the architecture of a specific ROIA that is deployed on the RS in MRCP or in another cloud computing environment.
The introduction of cloud computing brings better scalability and other benefits to ROIA, but it also brings some challenges. In cloud computing, node failures are no longer considered accidental events but normal ones, which places higher requirements on fault tolerance in ROIA. To address this problem, we propose a new overlapping architecture, which provides better fault tolerance as well as better performance.
3.2 An overlapping architecture
We divide the whole virtual world into a lattice of small, overlapping hexagonal cells. The cell is the basic unit of the virtual world. It cannot be divided into smaller areas, but several cells can be merged into a larger region that is assigned to one server. Cells could have different geometric shapes. Because a hexagon has the smallest number of adjacent cells, we adopt hexagonal cells. Therefore, if each cell is assigned to one server, the number of links between servers is smallest when hexagonal cells are used.
Figure 4 provides an overview of the new overlapping architecture. In this architecture, the whole virtual world is covered by overlapping hexagonal cells. The layer consisting of the small white hexagons with solid lines is called the active layer. The servers in this layer are in charge of logical computing, state information updating and other functions within their own areas.
The layer consisting of the big colored hexagons with dashed lines is called the shadow layer. Through synchronization, the servers in the shadow layer ensure that each entity in the active layer has a redundant, read-only counterpart. Under normal circumstances, a shadow layer server does not execute some of the functions executed by an active layer server, such as logical computing, so its load is relatively light. A shadow layer server can therefore be responsible for a larger area given equal performance. Accordingly, in the figure each colored hexagonal area is assigned to one shadow server, which is responsible for synchronization with several cells in the active layer.

Fig. 5 Topology graph representing the active layer in Fig. 4

In this overlapping architecture, each avatar in ROIA has two entities: one exists on the active layer server (the active entity), and the other exists on the shadow layer server (the shadow entity). The active entity is the same as the entity in a traditional ROIA, and the shadow entity can be seen as a redundant, read-only copy of the corresponding active entity. Only in special cases, such as a server crash, does the shadow entity act as an active entity temporarily.
This cell overlapping architecture has several important merits: it facilitates seamless migration of entities across cell boundaries, it provides fault tolerance to protect against server crashes, and it allows load balancing between servers. These merits are explained in detail in Sect. 4.
3.3 Topology graph and granularity
If we abstract the hexagonal cells into vertices and let an edge between two vertices represent the neighbor relationship between the corresponding cells, the active layer in Fig. 4 can be abstracted into the graph shown in Fig. 5. The shadow layer can be abstracted into a similar graph. The weight of a vertex is the load of the corresponding cell (e.g., bandwidth, CPU, etc.). The interaction between two cells defines the weight of the edge connecting the corresponding vertices.
Generally, several cells are merged into a larger region, and the region is then assigned to an appropriate server according to its total load. In other words, a region represents all the cells on one server, and the cell is the basic unit. After the above abstraction, regions can be formed by the following greedy algorithm: starting from the heaviest vertex in the graph, at each step the vertex connected by the heaviest edge to any of the vertices already selected is added, until the sum of the vertex weights reaches a certain threshold related to the total capacity of the server that will be responsible for the load of this region.
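The greedy region formation described above can be sketched as follows. This is a hedged illustration rather than the author's implementation: the function names, the dictionary-based graph representation and the example loads are all hypothetical, and the sketch assumes that growth stops before the capacity threshold would be exceeded (the text only says "reaches a certain threshold").

```python
def form_region(loads, edges, capacity, free=None):
    """Grow one region greedily from the heaviest free cell.

    loads:    dict cell -> load (vertex weight)
    edges:    dict (cell_a, cell_b) -> interaction (edge weight)
    capacity: load threshold of the server that will host the region
    free:     cells not yet assigned to a region (defaults to all cells)
    """
    free = set(loads) if free is None else set(free)
    if not free:
        return []
    seed = max(sorted(free), key=lambda c: loads[c])  # heaviest vertex
    region, total = [seed], loads[seed]
    free.discard(seed)
    while total < capacity:
        # Free cells linked to the region, with the weight of the link.
        candidates = []
        for (a, b), weight in edges.items():
            if a in region and b in free:
                candidates.append((weight, b))
            elif b in region and a in free:
                candidates.append((weight, a))
        if not candidates:
            break
        _, cell = max(candidates)  # heaviest edge into the region
        if total + loads[cell] > capacity:
            break  # assumption: never exceed the server capacity
        region.append(cell)
        total += loads[cell]
        free.discard(cell)
    return region


def partition(loads, edges, capacity):
    """Repeat the greedy step until every cell belongs to a region."""
    free, regions = set(loads), []
    while free:
        region = form_region(loads, edges, capacity, free)
        regions.append(region)
        free -= set(region)
    return regions


loads = {"A1": 5, "A2": 3, "A3": 4, "A4": 2}
edges = {("A1", "A2"): 7, ("A1", "A3"): 2, ("A2", "A4"): 1, ("A3", "A4"): 6}
print(partition(loads, edges, capacity=9))  # [['A1', 'A2'], ['A3', 'A4']]
```

Following the heaviest edge keeps strongly interacting cells on the same server, which is the point of weighting edges by interaction.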
In the new overlapping architecture, to reduce the complexity of the mapping between active layer cells and shadow layer cells, each layer uses a fixed cell granularity. (Because the cell granularity of the shadow layer is larger than that of the active layer, some cells at the edge of the virtual world have a different size, while the other cells keep the same granularity.) In practical operation, the load of a cell can be migrated between suitable servers to achieve dynamic load balancing. The cell granularity therefore needs to be moderate in practical applications: an excessively large granularity is unfavorable for dynamic load balancing, and a too small granularity increases the complexity of the system.
Dynamic load scheduling algorithms will be studied in depth in our future work; the main objective of this paper is to introduce the new overlapping architecture and its merits. Therefore, for ease of understanding and illustration, a relatively large cell granularity is used in this paper.
In addition, the shadow layer and the active layer have different cell granularities: the cell granularity of the shadow layer is larger than that of the active layer. There are two main reasons for this. Firstly, the workload of a shadow layer server is lower than that of an active layer server responsible for a region of the same size, so a shadow layer server can be responsible for a larger region on the same hardware. Secondly, a larger shadow layer cell is more conducive to seamless migration across regions, as presented in detail in Sect. 4.
Because the two layers have different cell granularities, an active layer cell may be mapped to two shadow layer cells. A smaller granularity is therefore needed to represent the resulting blocks when establishing the mapping between the shadow layer and the active layer.
For example, in Fig. 6 the active cell A6 is overlapped by two different shadow cells (S1 and S2). In the figure, the ids of active layer cells start with 'A' and the ids of shadow layer cells start with 'S'.
The cell A6 therefore needs to be divided into smaller blocks to establish the mapping between the two layers. It can be observed that there are only three cases in which shadow layer cells overlap an active layer cell, as shown in Fig. 7a, b, c. In each case, the cell A1 is divided into two equal blocks, and each block is mapped to a different shadow layer cell.
We define a kind of triple, and use sets of these triples to represent the mapping between the layers.
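As a minimal sketch of this triple-based mapping, the triple (Cell_ID, Dividing_Case, Block_ID) defined in this section can be modeled as a small record type. The class and function names below are hypothetical, and deriving the active-side sets by inverting the shadow-side sets is an assumption made for illustration; the paper only states that both kinds of sets are kept. The S1 entries and the A5 and A6 results follow the paper's Fig. 6 example; for S2, only the entry implied by the paper's A6 set is included, since its other cells are not listed in the text.

```python
from typing import NamedTuple


class TripleCell(NamedTuple):
    cell_id: str
    dividing_case: int  # 0 = not divided, 1..3 = the three cases of Fig. 7
    block_id: int       # 0 = not divided, 1 = left/below, 2 = right/above


# Shadow-side view: which active cells (or blocks) each shadow cell covers.
shadow_to_active = {
    "S1": {
        TripleCell("A1", 3, 2), TripleCell("A2", 2, 1), TripleCell("A3", 1, 1),
        TripleCell("A4", 1, 2), TripleCell("A5", 0, 0), TripleCell("A6", 3, 1),
        TripleCell("A7", 2, 2),
    },
    # Partial: only the entry implied by the paper's set for A6.
    "S2": {TripleCell("A6", 3, 2)},
}


def invert(mapping):
    """Derive the active-side sets from the shadow-side sets (assumption)."""
    active_to_shadow = {}
    for shadow_id, triples in mapping.items():
        for t in triples:
            active_to_shadow.setdefault(t.cell_id, set()).add(
                TripleCell(shadow_id, t.dividing_case, t.block_id)
            )
    return active_to_shadow


active_to_shadow = invert(shadow_to_active)
print(sorted(active_to_shadow["A6"]))
```

With these inputs, the set for A5 contains the single triple (S1, 0, 0) and the set for A6 contains (S1, 3, 1) and (S2, 3, 2), matching the sets given in the paper.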

Triple_Cell = (Cell_ID, Dividing_Case, Block_ID)

Cell_ID: the id of the cell. When Cell_ID is the id of an active layer cell, the Triple_Cell is used in a shadow layer server and indicates which active layer cell the shadow layer cell overlaps. When Cell_ID is the id of a shadow layer cell, the Triple_Cell is used in an active layer server and indicates which shadow layer cell overlaps the active layer cell.
Dividing_Case: indicates the case in which the active layer cell is divided by shadow layer cells. The three cases in Fig. 7 are labeled '1', '2' and '3', respectively. In addition, '0' represents the case in which the cell is not divided.
Block_ID: indicates the particular block related to this Triple_Cell after the cell has been divided into two blocks. We use '1' to represent the block located on the left or below, and '2' to represent the block located on the right or above. In addition, '0' again represents the case in which the cell is not divided.

After this definition, each block in Fig. 7 can be conveniently represented by a triple (as shown in Fig. 7). The mapping between the shadow layer and the active layer can also be easily represented by sets of triples. For example, the active layer region covered by S1 in Fig. 6 can be represented as the following set. This set is usually used in the shadow layer server to easily find the region that the shadow layer server covers.

S1 = {(A1, 3, 2), (A2, 2, 1), (A3, 1, 1), (A4, 1, 2), (A5, 0, 0), (A6, 3, 1), (A7, 2, 2)}

Meanwhile, the active layer cells A5 and A6 in Fig. 6 can be represented as the following sets.

A5 = {(S1, 0, 0)}
A6 = {(S1, 3, 1), (S2, 3, 2)}

These sets are usually used in the active layer servers to easily find the shadow cells that overlap the active layer cells.

Fig. 6 The overlapping relations
Fig. 7 Three dividing cases (a–c), with blocks labeled (A1, 1, 1), (A1, 1, 2), (A1, 2, 1), (A1, 2, 2), (A1, 3, 1) and (A1, 3, 2)

4 The merits of the new architecture

In this section, we present the main merits of the new overlapping architecture in detail.

4.1 Seamless migration across regions


When entities move across cell boundaries, the update responsibilities for these entities are transferred between servers. The transfer may take some time, so without appropriate techniques to deal with it, users will perceive a noticeable delay that degrades the user experience.
In the new architecture, seamless migration across cells can be achieved by taking advantage of the AoI. When the AoI of an entity intersects the boundary of another cell and the entity is moving towards this cell, the transfer of the entity's related information to the server of that cell is triggered. Some time passes between the AoI intersecting the boundary and the entity itself crossing the boundary. During this time, the transfer of the entity's related information may be completed, so the management responsibility can shift quickly.
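The trigger condition described above can be sketched as a simple geometric test. This is a simplified illustration under stated assumptions: the shared edge between two cells is treated as an infinite straight line given by a point and a unit normal pointing into the neighbouring cell, the AoI is a circle, and the names `Boundary` and `should_start_transfer` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Boundary:
    """Hypothetical model of the edge shared with a neighbouring cell."""
    px: float  # a point on the boundary line
    py: float
    nx: float  # unit normal pointing into the neighbouring cell
    ny: float
    neighbour: str


def should_start_transfer(x, y, vx, vy, aoi_radius, boundary):
    """True when the entity's AoI reaches the boundary while it moves towards it."""
    # Signed distance to the boundary: negative while still in the own cell.
    signed = (x - boundary.px) * boundary.nx + (y - boundary.py) * boundary.ny
    still_in_own_cell = signed <= 0.0
    aoi_reaches_boundary = -signed <= aoi_radius
    moving_towards = vx * boundary.nx + vy * boundary.ny > 0.0
    return still_in_own_cell and aoi_reaches_boundary and moving_towards


edge_to_a1 = Boundary(px=10.0, py=0.0, nx=1.0, ny=0.0, neighbour="A1")
print(should_start_transfer(7.0, 0.0, 1.0, 0.0, 5.0, edge_to_a1))   # True
print(should_start_transfer(2.0, 0.0, 1.0, 0.0, 5.0, edge_to_a1))   # False
print(should_start_transfer(7.0, 0.0, -1.0, 0.0, 5.0, edge_to_a1))  # False
```

The larger the AoI radius relative to the entity's speed, the more time the servers have to finish the transfer before the entity actually crosses the boundary.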
Figure 8 gives a detailed example. For ease of labeling, the ids of the cells in the active layer start with 'A' (there are nine cells, marked A1 to A9, in Fig. 8), and the ids of the cells in the shadow layer start with 'S' (S1 to S3). The ids of the entities in ROIA start with 'E' (E1, E2 and E3 in Fig. 8). The circle around an entity represents the AoI of this entity, and the arrow represents the direction of its movement. For illustrative purposes, it is assumed that one server is assigned to each cell.
In the figure, E1 will move from A5 to A1. The management responsibility for E1 in the active layer will be transferred from the A5 server to the A1 server. It may take some time to create a new entity of E1 on the A1 server and to transfer the information of E1 from the A5 server to the A1 server. This work is triggered when the border of E1's AoI intersects the boundary of A1. By the time E1 moves across the boundary of A1, this work may be completed, and the management responsibility will shift quickly. As a result, users do not perceive the partitioning of the virtual world.

Fig. 8 Seamless migration across cells

Because of the overlapping, an entity's movements across cells can be divided into three cases, whose costs are different. They are analyzed in detail as follows.

1. Active entity crosses a boundary but shadow entity does not (such as E1)

In this case, the active entity of E1 moves from A5 to A1, but the shadow entity of E1 stays in S1. A new entity of E1 needs to be created on the A1 server, and information needs to be transferred from the A5 server to the A1 server. No additional work is needed in the shadow layer.
Total Cost = Cost of creating entity in active server + Cost of transferring information between active servers.

2. Active entity does not cross a boundary but shadow entity does (such as E2)

In this case, the shadow entity of E2 moves from S1 to S3, but the active entity of E2 stays in A6. The update responsibility is shifted in the shadow layer, and nothing is shifted in the active layer.
Total Cost = Cost of creating entity in shadow server + Cost of transferring information between shadow servers.

3. Active entity and shadow entity both cross a boundary (such as E3)

In this case, the active entity of E3 moves from A4 to A7, while the shadow entity of E3 moves from S1 to S3. The management responsibilities in both layers are shifted.
Total Cost = Cost of creating entity in active server and shadow server + Cost of transferring information between active servers and between shadow servers, respectively.
The total cost of this case is the largest of the three cases. However, it can be observed from Fig. 8 that the probability of this case happening is very small.
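The three movement cases and their costs can be summarized in a short sketch. The function name and the per-layer cost parameters are hypothetical; each parameter stands for the sum of the entity creation cost and the information transfer cost in that layer, and the default values are arbitrary placeholders, not measurements.

```python
def classify_move(active_from, active_to, shadow_from, shadow_to,
                  active_cost=2.0, shadow_cost=2.0):
    """Return (case, total_cost) for one movement of an entity.

    case is 1, 2 or 3 as in the list above, or None if no boundary is crossed.
    active_cost / shadow_cost: assumed cost of creating the entity plus
    transferring its information within the respective layer.
    """
    active_crossed = active_from != active_to
    shadow_crossed = shadow_from != shadow_to
    total = 0.0
    if active_crossed:
        total += active_cost
    if shadow_crossed:
        total += shadow_cost
    if active_crossed and shadow_crossed:
        case = 3
    elif active_crossed:
        case = 1
    elif shadow_crossed:
        case = 2
    else:
        case = None
    return case, total


print(classify_move("A5", "A1", "S1", "S1"))  # E1: (1, 2.0)
print(classify_move("A6", "A6", "S1", "S3"))  # E2: (2, 2.0)
print(classify_move("A4", "A7", "S1", "S3"))  # E3: (3, 4.0)
```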

4.2 Server crash protection


Because of the redundancy provided by the overlapping cells, our new architecture achieves good fault tolerance. Servers in both layers may crash, so in practice a variety of situations can arise. For simplicity, it is assumed that only one server crashes at a time. Server crashes fall into the following two categories.
4.2.1 Server crash in active layer
Because a server in the active layer is in charge of active entity management, the crash of an active server has a greater impact. However, because the new architecture keeps a corresponding redundant, read-only copy in the shadow layer, the server can be restored quickly. Specifically, there are two situations for a server crash in the active layer:
4.2.1.1 Inside server crash An inside server is an active server that is located in the middle of the cell of the overlapping shadow server in the virtual world. For example, the A5 server is an inside server in Fig. 8. Assume that the A5 server crashes. There are two strategies to deal with this:

1. The S1 server acts as the A5 active server temporarily.
2. The responsibility for A5 is shifted to the most lightly loaded of the six servers adjacent to A5.

The first strategy has the lower cost, because all the entities in A5 and the virtual world information of A5 already exist on the S1 server. It is only necessary to redirect the user links of the A5 server to the S1 server and to activate the S1 server as an active server (only for A5; it remains just a shadow server for the other cells).
The cost of the second strategy is higher than that of the first. Suppose the A1 server has the lightest load. In addition to redirecting the A5 user links to the A1 server, the A1 server must also obtain all the entity information and virtual world information of A5 from the S1 server, so this strategy costs more.
4.2.1.2 Edge server crash An edge server is an active server that is located on the boundary between two shadow servers in the virtual world. For example, the A8 server is an edge server in Fig. 8. If the A8 server crashes, there are several ways to deal with this:

1. The S1 server or the S2 server acts as the A8 active server temporarily.
2. The responsibility for A8 is split between S1 and S2, each of which acts as the active server for its part of A8.
3. The responsibility for A8 is shifted to the most lightly loaded of the six servers adjacent to A8.

In the third strategy, the cost of migrating to different servers is not the same, but in every case the cost is higher than that of the other strategies.
The first and second strategies each have their own merits and disadvantages. The merit of the first strategy is that the entities of A8, which may interact closely, remain on one server. In the second strategy, by contrast, the entities of A8 are separated and placed on S1 or S2, so the cost of interactions between these entities increases.
The advantage of the second strategy is that it requires no additional work, such as creating new entities and transferring related information, because half of the A8 shadow entities are already on the S1 server and the other half are on the S2 server. In the first strategy, in contrast, additional work is needed to let the S1 or S2 server take control of the whole of A8.
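The choice between promoting one shadow server and splitting a crashed cell between two shadow servers can be sketched with the triple sets of Sect. 3.3. This is an illustrative sketch only: `plan_recovery` and `prefer_split` are hypothetical names, the example data reuse the sets given in the paper for A5 (an undivided cell) and A6 (a cell overlapped by S1 and S2), because the triple set of A8 is not listed in the text, and picking the shadow server with the smallest id for the first strategy is an arbitrary assumption (a real system might choose by load).

```python
def plan_recovery(crashed_cell, active_to_shadow, prefer_split=True):
    """Return {shadow_server: [block_ids]} to be served as temporary active server.

    active_to_shadow maps an active cell id to a set of
    (shadow_id, dividing_case, block_id) triples.
    """
    triples = sorted(active_to_shadow[crashed_cell])
    if len(triples) == 1:
        # Inside server: the single overlapping shadow server takes over.
        shadow_id, _case, block = triples[0]
        return {shadow_id: [block]}
    if prefer_split:
        # Edge server, second strategy: each shadow server keeps its own block.
        return {shadow_id: [block] for shadow_id, _case, block in triples}
    # Edge server, first strategy: one shadow server takes the whole cell.
    chosen = min(shadow_id for shadow_id, _case, _block in triples)
    return {chosen: sorted(block for _s, _case, block in triples)}


active_to_shadow = {
    "A5": {("S1", 0, 0)},
    "A6": {("S1", 3, 1), ("S2", 3, 2)},
}
print(plan_recovery("A5", active_to_shadow))                      # {'S1': [0]}
print(plan_recovery("A6", active_to_shadow))                      # {'S1': [1], 'S2': [2]}
print(plan_recovery("A6", active_to_shadow, prefer_split=False))  # {'S1': [1, 2]}
```

The third strategy (shifting to an adjacent active server) is left out because it depends on load information that the triple sets do not carry.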
4.2.2 Server crash in shadow layer
Because the actual functions of ROIA are executed on the active servers and a shadow server holds only a redundant copy, a server crash in the shadow layer has no marked impact on the running application. Moreover, recovery from a shadow server crash is relatively simple: it is only necessary to repair the server and synchronize the related information from the corresponding active servers.
The worst case is that the corresponding servers of both layers crash. In this case, the current information of the users is lost. However, with the help of other servers (login server, character server), the entity state information can be recovered to a slightly older state. Of course, the probability of this worst case is very small.
In summary, with the introduction of the overlapping layer, handling server crashes becomes relatively easy.
4.3 Load balancing
In our new architecture, there are two main approaches to
load balancing: dynamic zoning and shadow server
assisting. These approaches can be used alone, and can also
be used in combination.

4.3.1 Dynamic zoning
Dynamic zoning means that the system dynamically expands
or shrinks a cell according to the load of its cell
server. Dynamic zoning is the basic approach to load balancing in ROIA.
Merging occurs mainly when the loads of adjacent cell
servers are all light. These adjacent cells can be merged
into one big cell assigned to one server. The freed servers can
be assigned to other work or shut down.
4.3.2 Shadow server assisting
Dividing cannot solve all problems. In some cases, a
cell is not suitable for division into smaller cells, and
shadow server assisting can be used instead.
Shadow server assisting takes advantage of the redundant entity information held in the shadow servers and shifts
the management responsibility for some entities to a shadow
server in order to reduce the load of the active server. This is
similar to the approach taken when an
inside server crashes, except that here only part of the load is shifted
to the shadow server.

5 Conclusions
As an emerging large-scale distributed application, ROIA
places high demands on performance, and a robust, efficient and suitably scalable architecture is essential to
it. Previous works are mostly based on the C/S or
P2P mode, and these architectures cannot simply be copied
to a cloud computing environment. Therefore, we study the
architecture of ROIA in the cloud computing environment.
We previously proposed MRCP, an approach to achieving
high scalability for ROIA in a cloud environment. Building
on that work, we propose a new overlapping
architecture that is robust, efficient and scalable, and that can be deployed on MRCP or on other cloud
computing environments.
In this paper, we introduce related work and the
techniques used in our new architecture, and then
present the overlapping architecture itself. We focus on
analyzing its three merits: seamless
migration across cells, protection against server crashes and
dynamic load balancing. We analyze the various processing strategies and their corresponding costs. This paper sheds
a little light on further research on ROIA.
Acknowledgments This work is supported by Guangzhou Science
and Technology Project (2010Y0-C681); Guangdong Science and
Technology Project (2010B060100056); Natural Science Foundation
of Guangdong (S2012010008831).

References


Albano M, Quartulli A, Ricci L et al (2009) AoI cast by tolerance
based compass routing in distributed virtual environment.
Proceedings of the 8th annual workshop on network and systems
support for games, Paris

Alexander T (2005) Massively multiplayer game development,
Charles River Media, ISBN: 1584503904, 2nd edn.
Assiotis M, Tzanov V (2006) A distributed architecture for
MMORPG. Proceedings of 5th ACM SIGCOMM workshop on
network and system support for games
Bernier YW (2001) Latency compensating methods in client/server
in-game protocol design and optimization. Proceedings of 15th
game developers conference (GDC)
Bezerra CEB, Comba JLD, Geyer CFR (2012) Adaptive load-balancing for MMOG servers using KD-trees. ACM computers
in entertainment 10
Boulanger JS, Kienzle J, Verbrugge C (2006) Comparing interest
management algorithms for massively multiplayer games. Proceedings of 5th ACM SIGCOMM workshop on network and
systems support for games, Singapore
Cai W, Xavier P, Turner SJ et al (2002) A scalable architecture for
supporting interactive games on the internet. Proceedings of 16th
workshop parallel and distributed simulation, pp 60–67
Carter C, El Rhalibi A, Merabti M (2012) A survey of AoIM,
distribution and communication in peer-to-peer online game.
Proceedings of international conference on computer communications and networks, pp 1–5
Chen J, Wu B, Delap M et al (2005) Locality aware dynamic load
management for the massively multiplayer games. In: Proceedings of the 10th ACM SIGPLAN symposium on principles and
practice of parallel programming, pp 289–300
Chertov R, Fahmy S (2006) Optimistic load balancing in a distributed
virtual environment. In: Proceedings of the international workshop on network and operating systems support for digital audio
and video, pp 1–6
Fan L (2009) Solving key design issues for massively multiplayer
online games on peer-to-peer architectures. Doctoral dissertation,
Heriot-Watt University
Gaikai (2013) A Sony computer entertainment company.
http://www.gaikai.com. Accessed 19 Nov 2013
Ghosh S, Wiegand T, Goldiez B et al (2010) An Architecture
Supporting Large Scale MMOGs. Proceedings of the 3rd
international ICST conference on simulation tools and techniques
Glinka F, Raed A, Gorlatch S et al (2010) A service-oriented interface
for highly interactive distributed application. Proceedings Euro-Par 2009 workshops, LNCS 6043, pp 266–277
Gorlatch S, Meilaender D, Ploss A et al (2012) Towards bringing real-time online applications on cloud. Proceedings of international
conference on computing, networking and communications
(ICNC), pp 57–61
Jia L, Huiyou C et al (2010) Resource assignment algorithm under
multi-agent for P2P MMOG. J Comp Res Dev 47(12):
2067–2074
Kabus D, Terpstra WW, Cilia M et al (2005) Addressing cheating in
distributed MMOGs. Proceedings of 4th ACM SIGCOMM
workshop on network and system support for games, pp 1–6
Liu D, Zhao YL (2013) A new approach to scalable ROIA in cloud.
In: Proceedings of the 4th emerging intelligent data and web
technologies (EIDWT), pp 51–55
Ling D, Xiang-bin SHI et al (2012) Scalable P2P overlay architecture
for MMOG. J Chin Comput Syst 33(1):58–63
Lee YT, Chen KT, Su HI et al (2012) Are all games equally cloud-gaming-friendly? An electromyographic approach. In: Proceedings of the 11th ACM SIGCOMM workshop on network and
system support for games
Miller JL, Crowcroft J (2010) The near-term feasibility of P2P
MMOGs. Proceedings of 9th ACM SIGCOMM workshop on
network and system support for games
Mulligan JM, Patrovsky B (2003) Developing online games: an
insider's guide. New Riders Publishing, ISBN: 1592730000
Nae V, Iosup A et al (2011) Dynamic resource provisioning in
massively multiplayer online games. IEEE Trans Parallel Distrib
Syst 22(3):380–395
Onlive (2013) A Cloud game company. http://www.onlive.com.
Accessed 19 Nov 2013
Pantel L, Wolf LC (2002) On the suitability of dead reckoning
schemes for games. In: Proceedings of ACM SIGCOMM
workshop on network and system support for games
Smed H (2002) Aspects of networking in multiplayer computer
games. Electron Library 20(2):2002
Wang G, Wang K (2012) An efficient hybrid P2P MMOG cloud
architecture for dynamic load management. Proceedings of 2012
international conference on information networking, pp 199–204
