
Component Based Video Communication Tool for Collaborative Virtual Environment

Hirotatsu Sakamoto, Yoshihiro Okada, Toshihiko Shimokawa, Kazuo Ushijima
Graduate School of Information Science and Electrical Engineering, Kyushu University
6-10-1 Hakozaki, Higashi-ku, Fukuoka, 812-8581, Japan
Phone: +81-92-642-3872, Fax: +81-92-583-7632
hirotatu@swlab.csce.kyushu-u.ac.jp

okada@i.kyushu-u.ac.jp

toshi@csce.kyushu-u.ac.jp

ushijima@csce.kyushu-u.ac.jp

Abstract
This paper treats a component-based video communication tool for collaborative virtual environments. In particular, the authors propose a new concept and its realization mechanisms for easy construction of distributed 3D graphics applications that use video communications, e.g., network meetings, conferences, and training in a 3D virtual space. If a software component is represented as a visible, manually operable object, users can copy it and transfer it to another computer easily and rapidly. If a facility that manages video/audio data is realized as such an object, even end-users can easily and rapidly build networked video communication environments through the copy-and-transfer operation. To clarify the availability of this concept, the authors employ the IntelligentBox system as a research platform. The authors introduced a video communication facility as a software component into the IntelligentBox system. IntelligentBox already provides a network communication facility as a software component. Using these components, it will be possible to build collaborative virtual environments that support video communications through copy-and-transfer operations.

1 Introduction
Advances in information network technologies have made broadband networks available, and high-quality multimedia communications will become common in the near future. Much research on multimedia communications has already been done. Most of it focused on the development of multimedia communication systems such as network meeting systems, teleconference systems, and network lecture systems [2][3][5][8]. Some of it focused on multimedia communications in a 3D virtual environment [1]. Distributed 3D virtual environments are sometimes called collaborative virtual environments or distributed virtual environments. Much research on collaborative virtual environments has been done so far because advances in computer hardware technologies have made it possible to generate realistic 3D images in real time. We are also interested in collaborative virtual environments. In particular, we are currently focusing on multimedia communications in a collaborative virtual environment, since video/audio support is necessary for smooth communication in various applications.

Our research purpose is to establish the essential mechanisms that enable even end-users to develop collaborative virtual environments. In this paper, we propose a new concept called the copy-and-transfer operation and its realization mechanisms for easy construction of collaborative virtual environments that support video communications. If a software component is represented as a visible, manually operable object, users can copy it and transfer it to another computer easily and rapidly. This copy-and-transfer operation is similar to the copy-and-paste and cut-and-paste operations, which are standard GUI operations performed with a mouse. If a facility that manages video/audio data is realized as such a visible, manually operable object, even end-users would be able to build networked video communication environments only through the copy-and-transfer operation.

To clarify the availability of the copy-and-transfer operation, we employ the IntelligentBox system [6] as a research platform. IntelligentBox is componentware for developing interactive 3D graphics applications. IntelligentBox provides various software components as visible and manually operable 3D objects called boxes. IntelligentBox also provides a dynamic data linkage mechanism called slot connections, which enables users to construct 3D graphics applications by combining existing boxes through direct manipulations on a computer screen.
IntelligentBox has already provided video managing facilities as boxes called MovieBox and VideoBox. However, these boxes were not available over a network, so we improved them to support video communications via a network. IntelligentBox has also provided a network communication facility as a box called RoomBox [7]. Multiple RoomBoxes on different computers share specific user-operation events with each other. Those RoomBoxes virtually provide multiple users with a shared 3D space. Any already developed box can be imported into the shared 3D space without modification. Therefore, it is possible to construct a distributed 3D virtual environment by making a copy of a RoomBox existing on one computer and transferring it to another computer. Using these boxes, i.e., RoomBox, MovieBox, and VideoBox, it will be possible to build collaborative virtual environments that support video communications only through copy-and-transfer operations. In this process, end-users will not have to write any text-based programs. In this paper, we clarify the availability of the concept of the copy-and-transfer operation by discussing development costs and performance and by showing possible, practical application examples.

0-7695-0951-7/01 $10.00 © 2001 IEEE

2.1 MD (Model-Display object) structure


As shown in Figure 1, each box consists of two objects, a model and a display object. This structure is called an MD (Model-Display object) structure. A model holds the state values of a box, which are stored in variables called slots. A display object defines how the box appears on a computer screen and how it reacts to user operations. Figure 1 also shows the messages exchanged between the display object and the model, using a RotationBox as an example. Through direct manipulation of a box, its associated slot value changes, and its visual image simultaneously changes according to the new slot value. In this way, a box reacts to user manipulations according to its function.
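The MD structure can be illustrated with a minimal Python sketch. This is not the IntelligentBox implementation; the class names `Model` and `RotationDisplay`, the `angle` slot, and the method names are all hypothetical and chosen only to mirror the description above: a manipulation on the display object changes a slot in the model, and the model tells the display object to refresh its image.

```python
class Model:
    """Holds the state of a box in named slots (hypothetical sketch)."""

    def __init__(self, **slots):
        self.slots = dict(slots)
        self.displays = []  # display objects attached to this model

    def set_slot(self, name, value):
        # Store the new state, then tell every display object to redraw.
        self.slots[name] = value
        for display in self.displays:
            display.update(name, value)


class RotationDisplay:
    """Display object of an assumed RotationBox with one 'angle' slot."""

    def __init__(self, model):
        self.model = model
        self.shown_angle = model.slots["angle"]
        model.displays.append(self)

    def on_drag(self, delta_degrees):
        # User manipulation: ask the model to change its slot value.
        new_angle = (self.model.slots["angle"] + delta_degrees) % 360
        self.model.set_slot("angle", new_angle)

    def update(self, name, value):
        # Message from the model: the visual image follows the slot value.
        if name == "angle":
            self.shown_angle = value
```

Keeping the state in the model rather than in the display object is what later allows several display objects to share one model.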

2.2 Message-sending protocol for slot connections


Figure 2 illustrates the data linkage concept between boxes. Each box has multiple slots. One of its slots can be connected to one of the slots of another box. This connection is called a slot connection. When a parent-child relationship exists between two boxes, the slot connection is carried out by three messages: a set message, a gimme message, and an update message. A <value> in the set message format represents any value, and a <slotname> in the set and gimme message formats represents a user-selected slot of the parent box that receives these two messages. A set message writes a child box slot value into its parent box slot. A gimme message reads a parent box slot value and sets it into its child box slot. Update messages are issued from a parent box to all of its child boxes to tell them that the parent box slot value has changed. In this way, these three messages connect a child box slot and a parent box slot and combine their two functionalities.
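The interplay of the three messages can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual protocol code: the `Box` class, its `connect` and `change` helpers, and the slot names in the test are hypothetical, and each child is assumed to hold exactly one slot connection to its parent.

```python
class Box:
    """Hypothetical box with named slots and one slot connection to a parent."""

    def __init__(self, name, **slots):
        self.name = name
        self.slots = dict(slots)
        self.parent = None
        self.children = []
        self.child_slot = None   # slot of this box used in the connection
        self.parent_slot = None  # user-selected slot of the parent box

    def connect(self, parent, child_slot, parent_slot):
        # Establish the slot connection between this box and its parent.
        self.parent = parent
        self.child_slot = child_slot
        self.parent_slot = parent_slot
        parent.children.append(self)

    # Messages received by a parent box.
    def set(self, slotname, value):
        # Write the value into the parent slot, then notify all children.
        self.slots[slotname] = value
        for child in self.children:
            child.update()

    def gimme(self, slotname):
        # Return the current value of the requested parent slot.
        return self.slots[slotname]

    # Message received by a child box.
    def update(self):
        # The parent changed: read its slot and store it in the own slot.
        self.slots[self.child_slot] = self.parent.gimme(self.parent_slot)

    def change(self, value):
        # Local change on a child, forwarded to the parent by a set message.
        self.slots[self.child_slot] = value
        self.parent.set(self.parent_slot, value)
```

With two children connected to the same parent slot, a change made on one child reaches the other through set, update, and gimme in that order.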

[Related works]
One kind of collaborative virtual environment system using multimedia communications is a teleconference system. Very popular products are Cornell University's CU-SeeMe and Microsoft NetMeeting(TM). Well-known free software includes vic and vat [11]. Many other teleconference systems have been made so far. Honda et al. [4] proposed a virtual office based on a shared room concept. Yoshida et al. [10] proposed a virtual space teleconference system that supports intuitive interaction for creative and cooperative work. Although these are very powerful systems, it is not easy to reuse their essential mechanisms when developing distributed 3D graphics applications. Our research purpose is to clarify a software architecture that makes it easier to develop distributed virtual reality applications. This is the major difference between our research and others.

The remainder of this paper is organized as follows. Section 2 explains the essential mechanisms of IntelligentBox and RoomBox. Section 3 explains the realization mechanisms of MovieBox and VideoBox. Section 4 discusses development costs and performance and shows possible, practical application examples. Finally, Section 5 concludes the paper.

2 Essential mechanisms of IntelligentBox

IntelligentBox employs the following essential mechanisms inherited from IntelligentPad [9], a 2D synthetic media system, since IntelligentBox is an extension of IntelligentPad to 3D graphics applications.

2.3 A shared-copy and a distributed model-sharing

As previously mentioned, each box consists of two objects, i.e., a model and a display object. The MD structure allows more than one box to share a common model. This mechanism is called model-sharing, and the operation that generates a copy of a display object sharing a common model is called shared-copy. A box generated by shared-copy shares all slot values. After one of the model-shared boxes is transferred to another host via a network, a new corresponding box is generated on that host. This box has the same slot values, and messages passed via the network keep the slot values consistent at all times. This makes it possible to realize model-sharing via a network.
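A minimal sketch of shared-copy and distributed model-sharing is given below. It is only an illustration: the classes `SharedModel` and `Display`, the `transfer` method, and the `from_network` flag are hypothetical, and the network is simulated by direct method calls inside one process rather than by real messages between hosts.

```python
class Display:
    """Display object that mirrors the slot values of its model."""

    def __init__(self, model):
        self.model = model
        self.seen = dict(model.slots)
        model.displays.append(self)

    def refresh(self):
        self.seen = dict(self.model.slots)


class SharedModel:
    """Model that can be shared locally and replicated on other hosts."""

    def __init__(self, slots):
        self.slots = dict(slots)
        self.displays = []  # local display objects sharing this model
        self.peers = []     # replicas of this model on other hosts

    def shared_copy(self):
        # Local shared-copy: a new display object on the same model.
        return Display(self)

    def transfer(self):
        # Simulated transfer to another host: a replica with equal slot
        # values, linked to this model and to all existing replicas.
        replica = SharedModel(self.slots)
        for peer in [self] + self.peers:
            peer.peers.append(replica)
            replica.peers.append(peer)
        return replica

    def set_slot(self, name, value, from_network=False):
        self.slots[name] = value
        for display in self.displays:
            display.refresh()
        # Forward local changes only; changes that arrived from a peer
        # are not echoed back.
        if not from_network:
            for peer in self.peers:
                peer.set_slot(name, value, from_network=True)
```

The `from_network` flag is one simple way to avoid messages bouncing between replicas; the source does not state how IntelligentBox handles this.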

Figure 1: An MD structure of a box and its internal messages

Figure 2: Standard messages between boxes

Figure 3: Message flow between two RoomBoxes for network collaboration

2.4 RoomBox for collaborative virtual environments

This section briefly describes the idea of a shared 3D space and the functionality of the RoomBox. As shown in Figure 3, the RoomBox has a slot named 'event', which holds the current user-operation event on its descendant boxes. Specific user-operation events generated in a RoomBox are stored in this slot until the next event is generated. As shown in Figure 3, IntelligentBox provides a distributed model-sharing mechanism. It enables multiple, distributed RoomBoxes to virtually share a common model with each other through messages passed via a network. Through the distributed model-sharing mechanism, multiple RoomBoxes can share user-operation events with each other. Here, descendant boxes of a RoomBox are treated as collaboratively operable 3D objects.

In the example shown in Figure 3, two RoomBox models exist separately on different computers. These two models are kept in the same state by messages passed via the network. This linkage is built easily and rapidly by making a shared copy of a RoomBox on one computer and transferring it to the other computer. When a user operates a box in the RoomBox on computer A, the operation event is sent to the RoomBox model and subsequently set in its 'event' slot. This event is then sent by a message to the other RoomBox model on computer B. After these processes are completed, the operation event is applied to the corresponding box on computer B. In this way, distributed RoomBoxes share user-operation events among several computers.

3 MovieBox and VideoBox for video communications in collaborative virtual environments

As previously mentioned, RoomBoxes make it possible to build collaborative virtual environments easily and rapidly. Several distributed 3D graphics applications based on RoomBoxes have been made so far. If video communication facilities become available in the IntelligentBox system, these applications will become more practical. The video managing facilities, i.e., MovieBox and VideoBox, already existed. MovieBox plays a movie file, and VideoBox gets each image live from a video camera and displays it on a computer screen. However, neither box could support video communications. We therefore extended their functions so that they are applicable to various network applications that need video communications.

3.1 The design of MovieBox and VideoBox

Both MovieBox and VideoBox use texture mapping to display a 2D image in a 3D virtual space. Strictly speaking, a texture image is mapped onto the surface of a 3D object, i.e., a box. In this way, each frame of a movie file held in storage, or each frame of video data captured by a video camera, appears on a computer screen. As shown in Figure 4, both MovieBox and VideoBox have a 'frame' slot in their models. A texture image is loaded and stored in this slot. Periodically updating the content of the slot produces an animation.

Figure 4: The mechanism of MovieBox and VideoBox

Figure 5: The mechanism of video communication

3.1.1 MovieBox

The left part of Figure 4 illustrates the data flow of the MovieBox. The MovieBox has the following slots in its model: 'moviefile', 'currentNo', 'increment', 'TRIGGER', and 'frame'. The 'moviefile' slot value is the name of the movie file to play. The 'currentNo' slot stores the index number of the currently displayed frame. The next frame to be displayed is determined by the next 'currentNo' slot value, which is the sum of the current 'currentNo' slot value and the 'increment' slot value. Whenever the 'TRIGGER' slot is accessed, the 'currentNo' slot value is updated and the next frame is loaded. In practice, a TimerBox is used to access the 'TRIGGER' slot periodically through a slot connection, since a TimerBox has a slot named 'time' whose value is automatically increased by a timer process. Finally, the 'frame' slot stores binary 2D image data.
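The frame-stepping behavior of the MovieBox driven by a TimerBox can be sketched as follows. The slot names follow the text, but everything else is an assumption made for illustration: frames are passed in as a list of byte strings instead of being decoded from a movie file, the slot connection is modeled as a plain callback, and wrapping around at the end of the movie is a choice of this sketch that the source does not specify.

```python
class MovieBox:
    """Sketch of the MovieBox model with the slots named in the text."""

    def __init__(self, moviefile, frames, increment=1):
        self.moviefile = moviefile   # 'moviefile' slot: name of the movie file
        self.frames = frames         # stand-in for the decoded movie frames
        self.current_no = 0          # 'currentNo' slot
        self.increment = increment   # 'increment' slot
        self.frame = frames[0]       # 'frame' slot: binary 2D image data

    def trigger(self):
        # Access to the 'TRIGGER' slot: advance 'currentNo' by 'increment'
        # and load the corresponding frame (wrap-around is an assumption).
        self.current_no = (self.current_no + self.increment) % len(self.frames)
        self.frame = self.frames[self.current_no]


class TimerBox:
    """Sketch of a TimerBox whose 'time' slot increases on every tick."""

    def __init__(self):
        self.time = 0
        self.listeners = []

    def connect(self, callback):
        # Stand-in for a slot connection from 'time' to another box's slot.
        self.listeners.append(callback)

    def tick(self):
        # One step of the timer process: increase 'time', notify connections.
        self.time += 1
        for callback in self.listeners:
            callback()
```

In this sketch, connecting `movie.trigger` to a `TimerBox` and calling `tick()` repeatedly plays the movie; an `increment` of 2 would skip every other frame.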

3.1.2 VideoBox

The right part of Figure 4 illustrates the data flow of the VideoBox. The VideoBox has the following slots in its model: 'DriverNo', 'TRIGGER', and 'frame'. The 'DriverNo' slot value indicates a video capture driver, since multiple video capture drivers can be installed. Whenever the 'TRIGGER' slot is accessed, the VideoBox captures new video frame data and stores it in the 'frame' slot. As with the MovieBox, a TimerBox is used to access the 'TRIGGER' slot periodically. Finally, the 'frame' slot stores the current frame as binary 2D image data. Figure 4 shows the mechanism of MovieBox and VideoBox. The difference between the two boxes is the source of the texture image data: for MovieBox the source is a movie file, and for VideoBox it is a video capture camera.

3.2 Extensions of MovieBox and VideoBox for collaborative virtual environments

MovieBox and VideoBox had already been implemented before the work in this paper. However, distributed shared copies of them were not possible because, to save memory, the 'frame' slot held not the texture image data itself but an index number pointing to an entry in a texture table. To realize video communications, we modified the 'frame' slot so that it can store the image data itself. It then became possible to make a shared-copy of a MovieBox/VideoBox and transfer it to another computer to build a video communication environment, as shown in Figure 5. Furthermore, we improved the MovieBox/VideoBox to work with the RoomBox. Recall that the RoomBox handles user-operation events of its descendant boxes and that multiple shared copies of a RoomBox virtually provide a collaborative environment. We made the RoomBox handle the event of updating the 'frame' slot value of the MovieBox/VideoBox in the same way as other user-operation events. Video communications thus became possible in collaborative virtual environments built on the RoomBox.

3.3 AudioBox for audio communications

Developing actual applications requires both audio and video communications. We have almost finished implementing the video components, i.e., the VideoBox and MovieBox, but we have not yet implemented audio communications. IntelligentBox provides a box called SoundBox for playing a sound file, but this box is not available over a network. We plan to improve the SoundBox to make it available over a network, and we are trying to introduce a new box called AudioBox for audio communications. The essential mechanisms of AudioBox will be the same as those of VideoBox. After realizing these boxes, it would be possible to develop the practical applications shown in the next section.

4 Discussion

This section discusses possible, practical application examples, their development costs, and their performance.

4.1 Practical application examples


(1) Teleconference systems

Figure 6 shows a screen image of a teleconference system. A conference attendant is represented as a robot. Each robot's head is the combination of a CameraBox and a VideoBox. In this example, there are five conference attendants, but only four robots are visible because the screen image in the figure is the camera view of the remaining robot, which is controlled by the user of this computer. All boxes are software components, so even end-users can construct this application example only through direct manipulations, i.e., copy-and-transfer operations, on a computer screen. Strictly speaking, in this example, each attendant makes four copies of his or her own robot and transfers one copy to each of the other computers. A video communication environment like this one is then built.

Figure 6: A teleconference system example

(2) Tutoring systems

Figure 7 shows a screen image of a tutoring system. The model appearing in the main window of IntelligentBox represents a printer consisting of a top cover, a back cover, a paper tray and a toner cartridge. In this example, the user of this computer is a learner, and the person whose face appears on the screen is a tutor. The tutor teaches the learner the component structure of the printer and how to manipulate it, e.g., opening the top cover and removing the toner cartridge from the inside, pulling out the paper supply, and opening the back cover to remove jammed paper. User-operation events are shared by using RoomBox, and users' voices and facial images are shared by using VideoBox/AudioBox, so that the tutor can teach the printer operations through live demonstrations with oral explanations.

Figure 7: A tutoring system example

4.2 Development costs

The simplest way to create an application for video communication using VideoBox, RoomBox and TimerBox is as follows: (1) compose a composite box from a TimerBox and a VideoBox, (2) connect the 'time' slot of the TimerBox and the 'TRIGGER' slot of the VideoBox by a slot connection through a menu selection, (3) define the composite box as a descendant of a RoomBox, and (4) make a copy of the RoomBox and transfer it to another computer. The construction process for video communication, based on the copy-and-transfer operation, is thus very simple and easy even for end-users. Apart from the video/audio communication facilities, the application examples presented in this paper were actually developed in a few hours or less without writing text-based programs. They were constructed only through direct manipulations on a computer screen using a mouse.

4.3 Performances

The performance of video communication depends on such measures as frame rate, frame image quality, and network bandwidth. The frame rate is the number of frames displayed per unit time. The frame image quality is determined by two factors: image size (resolution) and the information quantity of one pixel color. Of course, a higher frame rate and higher frame image quality are both desirable if the network bandwidth allows, but such a situation is rare. Therefore, as one solution, we introduced a 'protocol' slot into MovieBox and VideoBox so that users can choose a protocol, TCP or UDP, according to the network bandwidth available to them. Furthermore, we will make it possible for users to directly choose the frame rate and frame image quality according to application types.

Here we consider the relation among the frame rate, the frame image quality and the network bandwidth. In IntelligentBox, the information quantity of one pixel color of a texture image is four bytes, RGBA. R, G, and B are the red, green, and blue color values respectively, and A is the alpha value, a transparency factor. In fact, the alpha value is not used, so the information quantity of one pixel color is reduced to three bytes. If users desire real-time communication, i.e., 30 fps, and a higher resolution for one frame image, i.e., 256 x 256 pixels, the necessary bandwidth is calculated as follows (taking 1 Mbit as 2^20 bits):

256 x 256 x 3 x 8 x 30 = 45 Mbps (bits per second)

Generally, a network with such a high bandwidth is not available to us. If the frame rate requirement is decreased to 20 fps, the necessary bandwidth becomes 30 Mbps (= 256 x 256 x 3 x 8 x 20). If, additionally, the resolution requirement is decreased to 128 x 128 pixels, the necessary bandwidth becomes 7.5 Mbps (= 128 x 128 x 3 x 8 x 20). Furthermore, if the information quantity of one pixel color is decreased to one byte, i.e., using a packed color, the necessary bandwidth becomes 2.5 Mbps (= 128 x 128 x 1 x 8 x 20).

Currently we are experimenting with real-time network collaboration over an ATM communication network, which we can use as a virtual dedicated communication line with up to 30 Mbps of bandwidth. We will check whether real-time communications, including video/audio communications, are possible without any stress, and we will report these findings in the near future.

We have been developing three versions of IntelligentBox for different machine environments: a Windows version, a FreeBSD/Linux version, and an SGI (IRIX OS) version. At present we are mainly developing the Windows and FreeBSD/Linux versions. In the FreeBSD/Linux version, the X Window System seems to be a bottleneck, since the rendering speed of the Windows version is faster than that of the FreeBSD/Linux version. The reason is clear: XFree86, the X Window System of our FreeBSD/Linux environment, supports neither OpenGL nor Mesa3D hardware acceleration, and texture mapping in particular strongly needs hardware acceleration support.

IntelligentBox employs unicast communication. In communication among multiple attendants, the load on each computer becomes heavy and network traffic becomes high. We are therefore trying to introduce multicast communication instead of unicast, and we will report performance in detail after realizing this.
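The bandwidth estimates for uncompressed frames in this section can be reproduced with a few lines of Python. The function name `required_mbps` is hypothetical, and the sketch assumes, as the paper's figures imply, that 1 Mbit is counted as 2^20 bits; with 10^6 bits per Mbit the first case would come out at about 47.2 Mbps instead.

```python
def required_mbps(width, height, bytes_per_pixel, fps):
    """Bandwidth for uncompressed frames in Mbps, with 1 Mbit = 2**20 bits."""
    bits_per_frame = width * height * bytes_per_pixel * 8
    return bits_per_frame * fps / 2**20


if __name__ == "__main__":
    # The four cases discussed in the text.
    cases = [
        (256, 256, 3, 30),
        (256, 256, 3, 20),
        (128, 128, 3, 20),
        (128, 128, 1, 20),
    ]
    for width, height, bpp, fps in cases:
        mbps = required_mbps(width, height, bpp, fps)
        print(f"{width}x{height}, {bpp} byte(s)/pixel, {fps} fps: {mbps} Mbps")
```

Only the second and later cases fit within a 30 Mbps line, which is consistent with the choice of lower frame rates and resolutions discussed above.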

5 Concluding remarks

This paper presented a component-based video communication tool for collaborative virtual environments. In particular, a new concept, the copy-and-transfer operation, was proposed and its realization mechanisms were explained. If a software component is represented as a visible, manually operable object, the copy-and-transfer operation on such an object becomes possible. If a facility that manages video/audio data is realized as such a visible, manually operable object, even end-users can easily and rapidly build networked video communication environments through the copy-and-transfer operation. To clarify the availability of this concept, we employed the IntelligentBox system and introduced a video communication facility as software components. This paper clarified the availability of the copy-and-transfer operation by discussing the development costs and performance of applications and by showing possible, practical application examples.

Acknowledgements

We would like to thank all members of our laboratory for their advice and suggestions. We also thank all members of a special interest group on networks at Kyushu University.

References

[1] Barrus JW, Waters RC, Anderson DB, Locales and Beacons: Efficient and Precise Support for Large Multi-User Virtual Environments, Proc. of IEEE Virtual Reality Annual International Symposium, Santa Clara, CA, pp. 204-213, 1996.
[2] Fahlen LE, Stahl O, Brown CG, Carlsson C, A Space Based Model for User Interaction in Shared Synthetic Environments, ACM INTERCHI 93, New York, pp. 43-48, 1993.
[3] Gajewska H, Kistler J, Manasse MS, Redell DD, Argo: A System for Distributed Collaboration, ACM Multimedia 94, pp. 433-440, 1994.
[4] Honda, S., Tomioka, H., Kimura, T., Ohsawa, T., Okada, K. and Matsushita, Y., A Virtual Office Environment Based on a Shared Room Realizing Awareness Space and Transmitting Awareness Information, Proc. of ACM UIST 97, pp. 199-207, 1997.
[5] Ishii H, Kobayashi M, Grudin J, Integration of Interpersonal Space and Shared Workspace: ClearBoard Design and Experiments, ACM Transactions on Information Systems, Vol. 11, No. 4, pp. 349-375.
[6] Okada, Y. and Tanaka, Y., IntelligentBox: A Constructive Visual Software Development System for Interactive 3D Graphic Applications, Proc. of Computer Animation '95, IEEE Computer Society Press, pp. 114-125, 1995.
[7] Okada, Y. and Tanaka, Y., Collaborative Environments in IntelligentBox for Distributed 3D Graphic Applications, The Visual Computer (CGS special issue), Vol. 14, No. 4, pp. 140-152, 1998.
[8] Streitz NA, Geißler J, Haake JM, Hol J, DOLPHIN: Integrated Meeting Support across LiveBoards, Local and Remote Desktop Environments, Proc. of ACM Conference on Computer-Supported Cooperative Work (CSCW 94), New York, pp. 345-358, 1994.
[9] Tanaka, Y., Meme Media and a World Wide Meme Pool, Proc. of ACM Multimedia 96, pp. 175-186, 1996.
[10] Yoshida, M., Tijerino, A. Y., Abe, S. and Kishino, F., A Virtual Space Teleconferencing System that Supports Intuitive Interaction for Creative and Cooperative Work, Proc. of ACM Symposium on Interactive 3D Graphics, pp. 115-122, 1995.
[11] http://www-nrg.ee.lbl.gov/
