
A Collaborative Cloud-Based Multimedia Sharing Platform for Social Networking Environments


Cristian Gadea, Bogdan Solomon, Bogdan Ionescu, Dan Ionescu
NCCT Lab, University of Ottawa, Ottawa, Canada
{cgadea, bsolomon, bogdan, dan}@ncct.uottawa.ca
Abstract: The amount of multimedia content on the Internet
has been growing at a remarkable rate, and users are increasingly
looking to share online media with colleagues and friends on
social networks. Several commercial and academic solutions have
attempted to make it easier to share this large variety of online
content with others, but they are generally limited to sending
links. Existing products have not been able to provide a scalable
cloud-based system that synchronizes disparate web content
among many users in real-time. Additionally, they have lacked
a platform with a modular architecture that can be extended
by developers to support new sources of online media. In this
paper, a cloud-based software architecture for a multimedia
collaboration platform is introduced. The platform is accessible
from a typical web browser and allows users to collaborate over
webcam chat while viewing videos, photos, maps, documents,
and listening to music, all in real-time. As examples, it is shown
how a distributed system called Watch Together was deployed
to real users within Facebook and an e-learning environment.
Usage data is provided from both deployments and observations
are made on how users share and consume real-time multimedia
content.
Index Terms: multimedia in social networking environments,
cloud-based digital content delivery, real-time web collaboration,
internet human-computer interaction, online multimedia sharing
I. INTRODUCTION
Web 2.0 has dramatically transformed the way in which
information is collected and presented to online users, and
the enormous popularity of social networking has created a
growing appetite for online multimedia content. Rather than
just viewing simple HTTP pages, users now expect the ability
to share and collaborate with other people, such as friends
or colleagues, online and in real-time. This can be seen with
services like Google Docs [1], where multiple users can work
on the same document at the same time, and the document is
stored on Google's remote servers in the cloud. Other popular
websites which offer the ability for online content sharing
between users include Facebook and Twitter, yet in both cases,
a message posted by a user containing links to photos or videos
is later viewed by one or more users separately. In effect,
this is no different from sending an e-mail with either
links or attachments.
The system presented in this paper aims to achieve real-time
collaboration between users who share a collaborative session
as a group. Although several commercial, open source and
academic web-based collaboration solutions have existed for
some time [2], the solution presented in this paper requires
that all users in the session see the exact same state of
the system - be it the same video at the same moment in
time, the same image or the same page in a document or
slide. Actions performed by a user (such as changing the
image, fast forwarding in the video, changing the page) are
replicated across all the users in order to ensure that the
same state is maintained. Additionally, text and video/audio
chat are integrated in the system, allowing users to see and
hear each other as they collaborate over the online multimedia
content. As is increasingly important for online services, the
system must make use of a cloud-based architecture so that it
continues to perform reliably as the popularity of the service
grows.
To support a large variety of online multimedia sources,
the system has to be a platform on top of which additional
synchronized applications can be developed and deployed.
As such, the platform can be extended to support the latest
popular online services and make their content collaborative
with relative ease. The system must therefore provide easy-
to-use APIs for developers. In addition, the synchronization
between users must be handled in a transparent way such
that developers do not have to worry about the necessary
synchronization messages reaching all users within a session.
Finally, the system must easily integrate with existing social
networking environments. As such, it must make use of the
latest web-based technologies and APIs. In order to be as
accessible as possible for users, the system must not require
the download and installation of proprietary browser plugins.
This is achieved by using the Flex 4 framework for Adobe
Flash [3] on the client side, along with a Real Time Messaging
Protocol (RTMP) based [4] server. Although Adobe Flash is
itself a browser plugin, it is currently more widely adopted than
HTML5, which is still in an experimental state in all major web
browsers.
Yahoo! Zync was one of the first products with the ability
to view synchronized videos [5]. As an add-on to the Yahoo!
Messenger client, Zync would detect when a participant of a
two-user IM conversation pasted a link to a YouTube video,
which would cause a synchronized video player window to
appear. The creators found that the synchronicity and social
co-presence promoted online conversation and engagement
with shared media, with 31% of users returning to reuse the
service after their first session. To use Zync, users would need
to download and install the Yahoo! Messenger application,
as well as the Zync add-on, and the system was limited to
Microsoft Windows.
978-1-4577-0638-7/11/$26.00 © 2011 IEEE
In the system proposed in this paper, users are able to
collaborate on multimedia content while having the content
rendered on each user's local machine within their own
instance of that application. Unlike common remote access
solutions [6], content in the system presented in this paper
is not re-encoded as part of a remote screen update. Videos
therefore run at full speed for all users and synchronization is
ensured through event-based signals.
The organization of the remainder of this paper is as
follows: Section II discusses the requirements and architecture
of the proposed collaboration system. Section III describes the
implementation details of the system and its API. Results from
the implemented system and usage data from two separate
live deployments are presented in Section IV. Section V then
reflects on this paper's contributions and proposes topics for
future research.
II. REQUIREMENTS AND ARCHITECTURE
This section offers a look at the overall design considerations
of the architecture. The system presented in this
paper must achieve real-time collaboration between the different
clients within the same session, while at the same
time being extendable and scalable across multiple servers. The
requirements for this architecture are first established, and the
architecture is then developed.
A. Requirements
A number of functional software requirements result from
the real-time browser-based collaborative nature of the system
presented in this paper:
1) Clients in the same session must see the exact same
thing in the collaborative part of the application.
2) Clients must be able to communicate with each other
through text, audio and video.
3) Clients must be able to search through numerous data
sources for multimedia content.
4) Clients must be able to invite one or more of their
contacts to a session.
5) Clients must be able to accept or reject a session
invitation received from another user.
6) Clients which join a session after it has started must be
synchronized to the state of the session.
In addition, the following non-functional requirements were
determined:
1) Clients must be able to access the system via a web
browser without the need for downloading extra proprietary
add-ons or plugins.
2) The system must scale by supporting the deployment of
new media servers in different locations.
3) New application types and data sources must be loaded on
demand by clients.
4) The system must be deployable within different social
networking environments and make use of the environment's
APIs to retrieve user information.
5) Depending on the deployment type, the system must
allow different data sources and application types to be
loaded.
Fig. 1. High-Level Server-Client Architecture. (Diagram showing a Facebook Server, a Media Server, Client 1 and Client 2, with numbered steps: 1) login, 2) authenticate user, 3) getSWF, 4) get user info, 5) connect, 6) communicate with Client 1.)
Fig. 2. Server Side Services. (Diagram showing the MediaServer with UserStateService, SessionService and WebcamVideoStreamService.)
B. Architecture
In order to achieve the functional and non-functional
requirements of the system, the architecture was developed as a
client-server application as shown in Figure 1. A web server
component is required to provide the HTML page and the
embedded Shockwave Flash (SWF) files necessary for the
client web browser to run the application. Once the client is
authorized and the SWF files are loaded on the client side,
the client connects via RTMP to the Media Server. After a
connection to the Media Server is established, the client can
start collaborating with other users.
1) Server Architecture: Figure 2 shows how the server side
consists of a Media Server that provides three key services:
• UserStateService - This service is used by the client to
  indicate changes in user presence (coming online, going
  offline, currently busy) or to retrieve the list of connected
  users.
• WebcamVideoStreamService - This service is called by the
  client to indicate that the user has enabled (or disabled)
  their webcam. The service also allows users to send
  their video/audio data to the Media Server, as well as
  to connect to existing streams from other users.
• SessionService - This service is used to invite others to
  a collaborative session, signal an acceptance/rejection of
  an invitation, obtain a list of other users in an existing
  session, as well as to send all application-specific
  synchronization messages.
In order to provide a scalable cloud-based architecture, a
JGroups Gossip Server can be used to perform the role of a
group manager as in Figure 3. As new Media Servers come
online, they signal their availability to the JGroups Gossip
Server, thus adding themselves to the group. The JGroups
Gossip Server then notifies the Access Control Server of the
presence of the new Media Server. When a second Media
Server comes online, the Access Control Server, as well as
the first Media Server, are notified. The client first contacts
the Access Control Server to obtain the location of the Media
Server closest to the client. The client then connects to this
Media Server, and the Media Server informs the JGroups
Gossip Server of the new client. The JGroups Gossip Server
then informs the other known Media Servers of this update.
Fig. 3. Cloud-Based Server-Client Architecture. (Diagram showing a Client, an Access Control Server, Media Server 1, Media Server 2 and a JGroups Gossip Server, with messages 1.1) broadcast new server, 1.2) to 1.4) new server, 2.1) getServer, 2.2) server location, 2.3) connect, 2.4) and 2.5) broadcast new client.)
Fig. 4. Cloud-Based Server Side Features. (Diagram showing the MediaServer with GroupManager, GroupClient, GroupMessageReceiver and ServerPeers.)
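The discovery flow of Figure 3 can be illustrated with a short Python sketch. It is not the authors' implementation: the class and method names are hypothetical, and the use of straight-line distance between coordinates as the measure of "closest" is an assumption, since the paper does not state how proximity is determined.

```python
from dataclasses import dataclass, field
from math import hypot
from typing import Dict


@dataclass(frozen=True)
class MediaServerInfo:
    """Hypothetical record announced when a Media Server joins the group."""
    name: str
    x: float  # assumed coordinates, used only to illustrate "closest"
    y: float


@dataclass
class AccessControlServer:
    """Tracks the known Media Servers and hands out the nearest one."""
    servers: Dict[str, MediaServerInfo] = field(default_factory=dict)

    def on_new_server(self, info: MediaServerInfo) -> None:
        # Invoked when the group manager announces a new Media Server.
        self.servers[info.name] = info

    def get_server(self, client_x: float, client_y: float) -> MediaServerInfo:
        # Return the Media Server at the smallest distance from the client.
        if not self.servers:
            raise LookupError("no Media Server is online")
        return min(
            self.servers.values(),
            key=lambda s: hypot(s.x - client_x, s.y - client_y),
        )
```

A client would call `get_server` first and then open its connection to the returned server; the broadcast of the new client to the other Media Servers is left out of this sketch.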
A Media Server must identify a list of users who are
relevant; that is, users who are currently connected to the
Media Server or are contacts of the connected users. Contacts
are important to track since any updates to their status (online,
offline, busy, etc.) must reach the connected users, no matter
which Media Server they were assigned to. These relevant users
should be identified as GroupClients. A GroupManager should
be used to track users on external servers (ServerPeers) and
transmit messages to them through the JGroups Gossip Server.
A GroupMessageReceiver can be used to receive incoming
messages. This is summarized in Figure 4.
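The notion of relevant users can be made concrete with a small sketch. The function names and the dictionary-based contact lists below are illustrative assumptions; the paper only states that a Media Server tracks its connected users and their contacts.

```python
from typing import Dict, Set


def relevant_users(connected: Set[str], contacts: Dict[str, Set[str]]) -> Set[str]:
    """Users a Media Server must track: connected users plus their contacts."""
    relevant = set(connected)
    for user in connected:
        relevant |= contacts.get(user, set())
    return relevant


def status_recipients(changed_user: str, connected: Set[str],
                      contacts: Dict[str, Set[str]]) -> Set[str]:
    """Locally connected users who list the changed user as a contact."""
    return {u for u in connected if changed_user in contacts.get(u, set())}
```

For example, if alice is connected and has bob as a contact, bob is relevant to alice's Media Server even when bob is connected elsewhere, so that a change in bob's status can be forwarded to alice.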
2) Client Architecture: The client side has to be easily
extendable and capable of loading its components on demand
based on the integration within a social network. This requires
the architecture of the client to be modular. The client side
modules are split into Domain Modules and Media Modules.
Domain Modules are used for dealing with domain-specific
data. A domain represents a deployment instance
of the system and each domain has a separate login approach,
which varies for each social networking environment. There
are two Domain Modules developed for each domain:
• A Login Module performs the login logic
  and retrieves the local user's information and contacts.
• A User List Module displays the user's contacts within
  the user interface. The reason for using different User
  List Modules is that different domains can have different
  contact categorizations. For example, Facebook contacts
  are called friends; a user has a number of friends and
  there are no subcategories. If the system is deployed
  within an organization's social network, however, the
  organization may define groups of users for various tasks.
  In order to maintain a consistent look and feel while
  making allowance for such functionality differences, all
  User List Modules extend an existing component.
Media Modules are used for developing collaborative
applications and dealing with their data sources. There are four
Media Module types: a Search Module, a Viewer Module, a
User Control Module, and an Information Module:
• The Search Module allows the user to search for a specific
  media item to share with other people. An external
  API of the media or data source is typically used to
  retrieve the search results. For example, for a YouTube
  application, the search component mimics the standard
  YouTube search options (search videos from "Today",
  "This Week", etc.) and displays a list of thumbnails for
  the videos.
• The Viewer Module displays the actual media content
  selected from the list provided by the Search Module. For
  YouTube, this is the actual streaming video. The viewer
  supports a maximized full-screen mode and is responsible
  for resizing the media content.
• The User Control Module displays the controls that the
  user requires to interact with the media content. Actions
  triggered via this component are typically synchronized
  with other users, although this is not always the case.
  For YouTube videos, the User Control Module contains
  play/pause, volume and video timeline seeking controls.
  The volume is not synchronized across the session, as
  various users might prefer different volume settings.
• The Information Module displays useful information
  related to the currently selected media content. In the case
  of YouTube, this is simply the title of the video.
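The distinction between synchronized and local-only actions in the User Control Module can be sketched as follows. The function name, the callback signatures and the set of local-only commands are assumptions made for illustration; only the fact that volume is not synchronized comes from the description above.

```python
from typing import Any, Callable

# Assumption: commands listed here affect only the local user,
# like the volume control of the YouTube application.
LOCAL_ONLY_COMMANDS = frozenset({"volume"})


def handle_user_action(
    command: str,
    data: Any,
    apply_locally: Callable[[str, Any], None],
    broadcast: Callable[[str, Any], None],
) -> bool:
    """Apply a control action locally and broadcast it unless it is local-only.

    Returns True when the action was sent to the rest of the session.
    """
    apply_locally(command, data)
    if command in LOCAL_ONLY_COMMANDS:
        return False
    broadcast(command, data)
    return True
```

Under this sketch a seek action reaches every participant, while a volume change stays on the machine where it was made.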
Figure 5 shows the activity diagram of the client. Initially,
the client's browser loads the SWF file from the server. Once
the main application is loaded, it determines the domain
under which it is deployed. Based on the domain, it loads
a configuration file from the server. This XML configuration
file describes the Domain Modules to load, as well as the
Media Modules which are available to the application. The
client code then loads the Login Module and the User List
Module, and performs a login for the user. As part of the
login process, the user's contacts are loaded. Next, the client
opens the connection to the Media Server and the Search
Modules become available. Finally, if the user selects a specific
media item to view, the corresponding Viewer Module and
User Control Module are loaded from the server.
Fig. 5. Client Activity Diagram. (Activities shown: loadSWF, MainApp Loaded/Determine Domain, loadConfig, Config Loaded, Load Login Module, Load User List Module, Connect to Media Server, and Load Media Module when the user loads media.)
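The configuration-driven loading described for Figure 5 can be sketched in Python. The paper does not give the schema of the XML configuration file, so every element name, attribute name and file path below is hypothetical; the sketch also assumes that Search Modules are fetched at startup, which the paper does not state explicitly.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Tuple

# Hypothetical configuration; element and attribute names are illustrative only.
EXAMPLE_CONFIG = """
<config domain="facebook">
  <domainModules>
    <module type="login" url="modules/FacebookLogin.swf"/>
    <module type="userList" url="modules/FacebookUserList.swf"/>
  </domainModules>
  <mediaModules>
    <media name="youtube"
           search="modules/YouTubeSearch.swf"
           viewer="modules/YouTubeViewer.swf"
           control="modules/YouTubeControl.swf"/>
  </mediaModules>
</config>
"""


def parse_config(xml_text: str) -> Tuple[str, Dict[str, str], Dict[str, Dict[str, str]]]:
    """Return the domain name, the Domain Modules and the Media Modules."""
    root = ET.fromstring(xml_text.strip())
    domain_modules = {m.get("type"): m.get("url") for m in root.find("domainModules")}
    media_modules = {
        m.get("name"): {kind: m.get(kind) for kind in ("search", "viewer", "control")}
        for m in root.find("mediaModules")
    }
    return root.get("domain"), domain_modules, media_modules


def startup_loads(domain_modules: Dict[str, str],
                  media_modules: Dict[str, Dict[str, str]]) -> List[str]:
    """Modules fetched at startup: login, user list, then each Search Module."""
    loads = [domain_modules["login"], domain_modules["userList"]]
    loads += [entry["search"] for entry in media_modules.values()]
    return loads


def on_media_selected(media_modules: Dict[str, Dict[str, str]], name: str) -> List[str]:
    """Modules fetched only once the user selects a media item."""
    entry = media_modules[name]
    return [entry["viewer"], entry["control"]]
```

Deferring the Viewer and User Control Modules until a media item is selected keeps the initial download small, which matches the on-demand loading requirement.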
Fig. 6. Connection Message Sequence Chart. (Participants: client1:Client, Server:MediaServer, client2:Client. Messages shown: connect, onConnect, onConnectAccept, connect.success, notifyIsOnline, userIsOnline (client1), notifyIsOnline (client1), sessionSetup, disconnect, onDisconnect, userIsOffline (client2).)
3) Server-Client Communication Architecture: Due to the
modular and extendable nature of the client, the communication
between the client and the server must be able to support
messages that were not considered at design time. Figure 6
shows the sequence diagram for establishing the connection
between the client and the server, and then disconnecting.
The diagram assumes that there is already a client connected
to the server (namely, client2). The sequence is initiated
when a different client, client1, connects to the server. Upon
connection, the Media Server performs access control and
determines if the user should be allowed to connect. If the
user is allowed to connect, a connect.success message is sent
back to the user's client (client1). The client then sends a
notifyIsOnline message, which contains a list of the unique
IDs of the client's contacts. Assuming client2 is a contact
of client1, the server posts messages to both client1 and
client2 that the other client is online. Following a collaborative
session, client1 disconnects from the Media Server. The server
then notifies client2 that client1 has gone offline.
Fig. 7. Server Class Structure. (Class diagram showing WTServer, GroupClient, GroupManager, JChannel, GroupReceiver, RemoteServer and ServerSession.)
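The connection sequence of Figure 6 can be approximated with a toy presence server. The message names connect.success, userIsOnline and userIsOffline are taken from the figure; the class, its methods and the outbox list are hypothetical, access control is reduced to accepting every client, and contact relationships are assumed to be symmetric.

```python
from typing import Dict, Iterable, List, Set, Tuple


class PresenceServer:
    """Toy stand-in for the presence handling of the Media Server."""

    def __init__(self) -> None:
        self.online: Set[str] = set()
        self.contacts: Dict[str, Set[str]] = {}
        # Each entry is (recipient, message name, subject of the message).
        self.outbox: List[Tuple[str, str, str]] = []

    def connect(self, client_id: str) -> str:
        # Access control would run here; this sketch accepts everyone.
        self.online.add(client_id)
        return "connect.success"

    def notify_is_online(self, client_id: str, contact_ids: Iterable[str]) -> None:
        self.contacts[client_id] = set(contact_ids)
        for contact in sorted(self.contacts[client_id] & self.online):
            # Both sides learn that the other one is online.
            self.outbox.append((contact, "userIsOnline", client_id))
            self.outbox.append((client_id, "userIsOnline", contact))

    def disconnect(self, client_id: str) -> None:
        self.online.discard(client_id)
        # Assumes symmetric contacts: the leaver's own list decides who is told.
        for contact in sorted(self.contacts.pop(client_id, set()) & self.online):
            self.outbox.append((contact, "userIsOffline", client_id))
```

Replaying the scenario from the text (client2 already connected, client1 connecting and later disconnecting) produces two userIsOnline messages followed by one userIsOffline message addressed to client2.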
III. IMPLEMENTATION
In order to implement the described architecture, Adobe
Flex was used for the client side and Red5 [7] was used as the
Media Server. Red5 was chosen for the server because
it is an open source implementation of the RTMP specification.
In order to achieve the desired modularity of the system, Flex
Modules are used. Flex Modules make it possible for new
modules to be downloaded at run time as they are required.
Figure 7 shows the server-side implementation that was
used. The WTServer class handles client connections, webcam
streams and session messages by using other classes like
ServerSession to provide the services previously introduced
in Figure 2. The server has zero or more clients connected at
any time and has zero or more sessions running at any time.
Clients can connect to the server and send messages to other
clients. WTServer tracks clients using the GroupClient class.
In order for a Media Server to track updates for only the
relevant users, WTServer creates GroupClient objects for all
connected users and their contacts. Exactly one of the localClient
and remoteClient properties is set to true to identify the type
of GroupClient object. If the object is a remoteClient, a
RemoteServer object is associated to identify the external Media
Server on which the client can be found. To transmit messages
to other users, the sendMessage() method is used, regardless
of which server the other users are on. For messages that must
reach users on remote servers, the GroupManager is used,
which contains a list of serverPeers and a link to a JChannel
object so that the JGroups Gossip Server can ensure the
message gets delivered to the correct server. A GroupReceiver
is used for managing responses received through the JChannel.
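The routing decision between local and remote delivery can be sketched as follows. The class names follow Figure 7, but the Python fields, the inbox list standing in for a client connection, and the recording stub used in place of the JChannel transport are assumptions for illustration; no JGroups API is called.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class RemoteServer:
    host: str
    port: int


@dataclass
class GroupManager:
    """Stub for the JChannel-backed transport; it only records what it would send."""
    sent: List[Tuple[RemoteServer, str, str]] = field(default_factory=list)

    def send_message(self, server: RemoteServer, client_id: str, message: str) -> None:
        self.sent.append((server, client_id, message))


@dataclass
class GroupClient:
    client_id: str
    # None means the client is connected to this Media Server (localClient).
    remote_server: Optional[RemoteServer] = None
    # Stands in for the connection of a locally connected client.
    inbox: List[str] = field(default_factory=list)

    def send_message(self, message: str, group_manager: GroupManager) -> None:
        if self.remote_server is None:
            self.inbox.append(message)  # deliver directly to the local client
        else:
            # Hand over to the group manager for delivery via the remote server.
            group_manager.send_message(self.remote_server, self.client_id, message)
```

The caller uses the same `send_message` call in both cases, which mirrors the statement that senders do not need to know on which server a recipient resides.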
On the client side, each Media Module implements the
MVC software pattern. In order to allow for the extensibility
of the platform, each of the controllers for the modules must
implement one of the provided interfaces. This is done in order
to ensure that the modules, which are loaded at runtime, can
communicate with each other. The model, which is common
between all four modules, is defined for each media type by
the developer as it is media-dependent. The view for each of
the modules is also media-dependent, and as such is defined
by the developer.
Fig. 8. Client Interfaces. (Class diagram showing the interfaces ViewerController with initComplete(), command(), setSize(), sync() and getSyncState(); SearchController with search() and loadMedia(); UserCommandController with sendCommand() and maximizeMinimize(); InformationController with setDescription(); and the classes MediaCommandQueue with getInstance(), addCommandToQueue() and playbackCommands(), and MediaCommand with the fields command, data and description.)
As can be seen in Figure 8, the ViewerController,
UserCommandController, InformationController and
SearchController make use of the MediaCommandQueue to perform
their necessary functions. MediaCommandQueue is a singleton
that stores commands until the Viewer Module is loaded and
then plays the commands in the order received. Otherwise,
if a new user joins a session, receives a synchronization
command, and new commands are received while the user is
still loading the correct Viewer Module, the newer commands
would be lost, which would lead to the desynchronization of
the session. A second role for the MediaCommandQueue
is that of ensuring that the correct modules are loaded.
When a new command is received, the MediaCommandQueue
determines if the currently loaded modules can perform the
command. If they cannot, then the correct modules are
loaded, the queue is emptied (since commands affecting the
loaded modules no longer apply while the new modules are
loaded) and the command is added to the queue. If, while
modules are being loaded, a new command comes which
requires different modules than the ones being loaded, then
similarly the queue is emptied and the command is added
to the queue. The MediaCommandQueue is also responsible
for broadcasting messages to other members of the session
through the connection to the server. To store a command, the
MediaCommandQueue uses a MediaCommand data structure
which has three fields: command, which represents a unique
ID for the command (for example loadVideo, play, pause, seek
for video media content); data, which stores the data for the
command (for example the new time position for the video
seek command); and description, which holds the description
of the media type used by the Information Module.
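The buffering behaviour of the MediaCommandQueue can be illustrated with a minimal sketch. It is not the Flex implementation: the singleton aspect and the broadcasting role are omitted, the method names are hypothetical, and the description field is used here as the key that decides which modules a command needs, which is an assumption.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional


@dataclass
class MediaCommand:
    command: str           # e.g. "loadVideo", "play", "pause", "seek"
    data: Any = None       # e.g. the new time position for a seek
    description: str = ""  # assumed here to identify the required media type


class MediaCommandQueue:
    """Buffers commands until the matching modules are loaded, then replays them."""

    def __init__(self, execute: Callable[[MediaCommand], None]) -> None:
        self._execute = execute              # stands in for the Viewer Module
        self._queue: List[MediaCommand] = []
        self._target: Optional[str] = None   # modules loaded or being loaded
        self._ready = False

    def add_command(self, cmd: MediaCommand) -> None:
        if cmd.description != self._target:
            # Different modules are needed: stale commands no longer apply.
            self._queue.clear()
            self._target = cmd.description
            self._ready = False
        if self._ready:
            self._execute(cmd)
        else:
            self._queue.append(cmd)

    def modules_loaded(self, media_type: str) -> None:
        if media_type != self._target:
            return  # this load was superseded by a newer command
        self._ready = True
        pending, self._queue = self._queue, []
        for cmd in pending:  # replay in the order received
            self._execute(cmd)
```

In this sketch a command for a different media type arriving during loading discards the buffered commands, and a late completion of the superseded load is ignored, so only commands for the current media are ever executed.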
Through the use of these MediaAPI interfaces, developers
can easily add new applications and data sources to the
Collaborative Web Client Platform without needing to worry
about the synchronization mechanism. The modular approach
also allows different Media Modules to be distributed across
different servers, and the system, through a simple configuration
file, can find and load them. Currently, applications have
already been developed to support YouTube videos, Flickr
and Facebook images, Twitter text messages, local documents
(where users can upload documents to the system and share
them), and live videos from UStream.
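As an illustration of how a developer might add a new media type, the following sketch renders part of the ViewerController interface of Figure 8 in Python and implements it for a hypothetical slide viewer. The method names are adapted from the figure, but their semantics are not specified in the paper; the sketch assumes that getSyncState() exports the current state and that sync() applies it for a late joiner. The gotoPage command and the SlideViewer class are invented for this example.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


class ViewerController(ABC):
    """Partial Python rendering of the ViewerController interface of Figure 8."""

    @abstractmethod
    def command(self, name: str, data: Any) -> None:
        """Apply a synchronized command to the viewer."""

    @abstractmethod
    def get_sync_state(self) -> Dict[str, Any]:
        """Export the state that a late joiner needs (assumed semantics)."""

    @abstractmethod
    def sync(self, state: Dict[str, Any]) -> None:
        """Apply a state exported by another participant (assumed semantics)."""


class SlideViewer(ViewerController):
    """Hypothetical viewer for a paged document."""

    def __init__(self, page_count: int) -> None:
        self.page_count = page_count
        self.page = 1

    def command(self, name: str, data: Any) -> None:
        if name == "gotoPage":
            # Clamp to the valid page range.
            self.page = max(1, min(self.page_count, int(data)))

    def get_sync_state(self) -> Dict[str, Any]:
        return {"page": self.page}

    def sync(self, state: Dict[str, Any]) -> None:
        self.page = state["page"]
```

A participant who joins after the session has started would receive the state exported by an existing participant and pass it to `sync`, which corresponds to functional requirement 6.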
IV. RESULTS
The implemented system was called Watch Together to
highlight its collaborative nature. Two variations of Watch
Together were deployed to different groups of live users and
their usage of the deployments was observed.
A. Facebook Deployment
The system was deployed to the public as a Facebook
Application [8]. Users can access the system by logging in
with their Facebook account and adding Watch Together to
their application bookmarks list. The Facebook Developer API
[9] was used to retrieve the information about the user who is
currently logged in (such as their name, list of friends, profile
image, etc.) and to populate the user interface.
Figure 9 shows a collaborative session containing six users,
where three of the users have enabled their webcam. The
viewer module described in the architecture section appears
in the top half of Watch Together and is always synchronized
between the users in the session. The users currently in the
session are shown along the bottom of the interface using
either their profile images (retrieved via the Facebook API) or
a live video stream from the user's webcam. Near the middle
of the interface is a menu bar with clickable icons. The left
side of the menu bar contains the list of applications developed
for the Watch Together platform (by using the API described
in Section III) to support various sources of popular online
multimedia content. Clicking one of these icons brings up
a search module containing thumbnails of the search results.
The thumbnails can be clicked on to change what is displayed
within the viewer module. The right side of the menu contains
icons for the contacts module (which shows a list of available
online users that can be invited to a session), text chat module,
and the settings module, all of which appear on top of the
viewer module but do not affect the synchronized content.
Based on over a thousand users who opted in to share
anonymized usage data, 65% of users of Watch Together were
found to be male and 35% were female. Surprisingly, 24%
of users enabled their webcam when using Watch Together,
which suggests that users enjoy sharing and
discussing online content in this collaborative fashion. Users
were particularly drawn to the YouTube application for sharing
their favourite video clips, with an average of 11.4 videos
viewed per user. The distribution of users by age can be seen
in Table I, with most in the 22-25 age group.
Fig. 9. Six Facebook Users Collaboratively Watching a YouTube Video.
TABLE I
WATCH TOGETHER USERS BY AGE GROUP.
Age Group    Percentage of Users (%)
< 18         2
18-21        11
22-25        54
26-29        28
30-33        4
> 33         1
In this deployment, the cloud-based architecture ensured
that the users were automatically distributed among two differ-
ent Watch Together Media Servers, and the webcam streams,
as well as all synchronization messages, performed very well
during testing. A common request from the test users was
to implement the ability to modify the volume of the audio
coming from each user's video chat stream. A volume slider
and mute button were therefore added over each user's video
stream, appearing whenever the user moves the mouse
over the video chat area.
B. E-Learning Deployment
A second version of Watch Together was customized as
a module for the Moodle e-learning software platform [10].
Moodle is an open source course management system used
by Prof. Ionescu for his classes at the University of Ottawa.
Each student is provided with an account in the system that
allows them to upload assignments, check their grades, etc.
The flexibility of the design allowed Watch Together to be
integrated using the Moodle module API such that students can
collaborate with each other and with the professor over course
material. While the system is mostly used for its document
sharing feature during the professor's online office hours,
YouTube videos and other content related to the class are also
made available for collaboration. The use of Watch Together in
this way reveals its potential beyond entertainment and more
towards enterprise-oriented social networking scenarios.
Fig. 10. Students Collaborating Over Slides Within the Moodle Platform.
V. CONCLUSION
This paper presented the design and implementation of
a cloud-based collaboration platform for experiencing syn-
chronized online media from a web browser. It was shown
how users can collaborate over video chat while viewing
videos, photos, maps, documents and more in real-time. This
differs from existing collaboration solutions which may require
cumbersome installations and lack a scalable design. Because
the system is a platform, developers can easily add new media
sources to it so that all popular digital media can be made
available for instant sharing. Additionally, the system's ability
to be integrated into social networking environments such as
Facebook and Moodle was demonstrated. It was observed
that users enjoy collaborating on online media in real-time
with one another. This feedback allows for further
improvements to be made to all aspects of the system and
for the addition of features such as real-time document editing
and games. Future papers will also focus on more experimental
results for the scalability of the cloud-based backend.
REFERENCES
[1] (2011) Google Docs - Online Documents, Spreadsheets, Presentations.
Google Inc. [Accessed: March 2011]. [Online]. Available:
http://docs.google.com/
[2] W. Wang, "Powermeeting: GWT-Based Synchronous Groupware," in HT
'08: Proc. of 19th ACM Conf. on Hypertext and Hypermedia. New
York, NY, USA: ACM, 2008, pp. 251-252.
[3] (2011) Flex Open-Source Framework. Adobe Systems Inc. [Accessed:
March 2011]. [Online]. Available: http://www.adobe.com/products/flex/
[4] (2011) Real-Time Messaging Protocol (RTMP) Specification.
Adobe Systems Inc. [Accessed: March 2011]. [Online]. Available:
http://www.adobe.com/devnet/rtmp.html
[5] Y. Liu, P. Shafton, D. A. Shamma, and J. Yang, "Zync: The Design
of Synchronized Video Sharing," in DUX '07: Proc. of 2007 Conf. on
Designing for User eXperiences. New York, NY, USA: ACM, 2007,
pp. 1-8.
[6] M. R. Thissen, J. M. Page, M. C. Bharathi, and T. L. Austin,
"Communication Tools for Distributed Software Development Teams," in
SIGMIS-CPR '07: Proc. of ACM SIGMIS CPR Conf. on Computer Personnel
Research. New York, NY, USA: ACM, 2007, pp. 28-35.
[7] (2011) Red5. The Red5 Project. [Accessed: March 2011]. [Online].
Available: http://red5.org/
[8] (2009, June) Watch Together. [Accessed: March 2011]. [Online].
Available: http://www.watch-together.com/
[9] (2011) Facebook Developers. Facebook Inc. [Accessed: March 2011].
[Online]. Available: http://developers.facebook.com/
[10] (2011) Moodle.org: Open-Source Community-Based Tools for
Learning. Moodle Trust. [Accessed: March 2011]. [Online]. Available:
http://www.moodle.org/
