Mingtao SUN
Australia National University Comp 6701 eScience Project June 2005
Table of Contents
1 Introduction
  1.1 Overview
    1.1.1 Background
    1.1.2 Purpose
    1.1.3 Scope
    1.1.4 Project Objectives
  1.2 Stakeholders
    1.2.1 Client
    1.2.2 Supervisor
    1.2.3 Developer
  1.3 Deliverables
  1.4 Schedule
  1.5 Document Organization
2 Requirements Analysis
  2.1 Detailed User Requirements
    2.1.1 Functional Requirements
    2.1.2 Non-functional Requirements
  2.2 Tools
    2.2.1 Hardware
    2.2.2 Software
    2.2.3 Programming language
    2.2.4 Operating System
3 Modelling
  3.1 Domain Chart
  3.2 Domain Description
    3.2.1 ImageManipulation Domain
    3.2.2 GUI Domain
    3.2.3 ImageResource Domain
    3.2.4 Database Domain
4 Implementation
  4.1 Automating image capture with the webcam
  4.2 Background Removal
  4.3 Image Scaling
  4.4 Feature Detection
  4.5 Image Processing in Java
    4.5.1 Loading/Saving an image from/to a file
    4.5.2 Scaling an image
    4.5.3 Processing a pixel
  4.6 JPEG or PNG?
5 Testing
  5.1 Testing Plan
  5.2 Testing Specifications
6 Project Evaluation
  6.1 Current Outcome
  6.2 Pitfalls
  6.3 Further Suggestions
7 Conclusion
8 References
Appendix A: Schedule
Appendix B: Work Breakdown Structure
Appendix C: User Manual
  C.1 Installation
  C.2 Menu Commands
    C.2.1 File
    C.2.2 Image
    C.2.3 Configure
  C.3 FAQ
    C.3.1 How can I start collecting Images?
    C.3.2 How can I remove the background of images?
    C.3.3 How can I do background removal settings?
    C.3.4 Do I have to do background removal settings every time?
    C.3.5 How can I pre-process the images?
    C.3.6 How can I detect feature of the images?
    C.3.7 Is it possible to select the whole directory as the input?
    C.3.8 How can I delete the image files?
    C.3.9 Which directory stores the image files?
1 Introduction
1.1 Overview
1.1.1 Background
Face recognition has received significant attention during the past several years. Even though current face recognition systems have reached a certain level of maturity, their success is limited by certain conditions. For example, recognition of face images acquired in an outdoor environment with changes in illumination remains a largely unsolved problem (Zhao, Chellappa 2003).

Beyond the problems of face recognition itself, the preparation work is also important. Face recognition research usually needs many face images, so how should those photos be acquired? A digital camera is one option, but exporting the photos to the computer becomes a tedious task. Question one is therefore how to quickly get a set of photos of a person. Question two is how to easily remove the background from hundreds of photos, since only the face in each photo counts. Doing these tasks manually is out of the question, hence the need for this project. The ultimate goal is to allow researchers to get the kinds of images they want as easily as possible.

This project does not involve any core face recognition algorithm; rather, it provides a foundation for further research in this field. It consists of two parts: acquiring photos from a web camera, and pre-processing the images, which includes removing the background, scaling the image to a standard size, and making the number of pixels in each image consistent. The project also touches briefly on feature detection.
1.1.2 Purpose
The purpose of doing this project is to learn about image processing techniques and to improve research techniques.
1.1.3 Scope
The scope of this project covers acquiring and pre-processing images for further processing in the face recognition field. It does not cover implementing any face recognition algorithm.
1.1.4 Project Objectives
- automates face image capture with a web camera
- preprocesses the captured images (removes background, scales size, and so on)
1.2 Stakeholders
1.2.1 Client
Professor Tom Gedeon, tom@cs.anu.edu.au, (612) 612 51052
1.2.2 Supervisor
Mr Pascal Vuylsteker, pvk@vuylsteker.net, (612) 612 58192
1.2.3 Developer
Mr Mingtao Sun, jackysun820@hotmail.com, (61) 402556204
1.3 Deliverables
Deliverable      Due Date
Source Code      20th June
Presentation     22nd June
Final Report     24th June
1.4 Schedule
This project started on 24th Feb 2005 and finished on 24th June 2005.

Task                                                                 Expected Duration  Actual Duration
                                                                     (days)             (days)
1. Analysing Client's Requirements
   1.1 Contacting the client and getting the initial requirements    1                  1
   1.2 Analysing the initial requirements                            3                  3
   1.3 Discussing detailed requirements and acceptance criteria
       with the client                                               1                  1
   1.4 Selecting tools                                               1                  1
2. Modelling
   2.1 Developing domain diagram                                     2                  2
   2.2 Developing class diagram for each domain                      4                  4
3. Implementation
   3.1 Automating image capture                                      20                 18
   3.2 Removing background                                           14                 20
   3.3 Scaling size of the image                                     5                  8
   3.4 Detecting feature of the face                                 12                 2
4. Testing and Debugging                                             5                  8
5. Documentation
   5.1 Commenting source code                                        4                  4
   5.2 Writing final report                                          15                 15

More details can be seen in Appendix A.
2 Requirements Analysis
FR2: Mandatory
FR3: Mandatory
FR4: Mandatory
FR5: Optional
clicks on the Taking a photo button, the snapshot is taken and the image file is saved immediately. It should need just one click.
NFR2 (Mandatory): Instructions for acquiring face images should make it clear to users where they should look when taking photos.
NFR3 (Mandatory): The software should be able to perform batch processing. That is, the user can select multiple files or even a whole directory for processing.
2.2 Tools
2.2.1 Hardware
- Macintosh G5 with three screens
- iSight web camera
2.2.2 Software
- Microsoft Word is used for documentation.
- Microsoft Project 2000 is used for generating the Work Breakdown Structure.
- JCreator is used for editing, compiling and running Java source code.
3 Modelling
[Figure: domain chart and class diagram of the ImageManipulation domain]
ImageManipulation Class: This class acts as the interface of this domain; all inter-domain data communication must go through this class. This class does not involve any calculation.
Settings Class: This class stores all the settings that can be changed while the software is running. This information includes the standard size of a pre-processed image, the number of pixels of an image, the settings for background removal, the default output directory and so on.
Managing Class: This class manages users' ID numbers, including deleting ID numbers and relevant images.
Preprocessing Class: This class pre-processes the images, which includes removing the background, scaling the image, and converting the image to grey scale. After pre-processing, the output images have the same size and the same number of pixels.
ImageShop Class: This class serves as a utility class for the Preprocessing class. It provides a set of useful methods for pre-processing images.
Face Class: This class serves as a utility class for the ImageShop class. It detects the boundary of the face in an image and counts the number of pixels in the image.
Acquiring Class: This class acquires images from the ImageResource domain and saves the acquired images to the database. This class also stores display parameters for the AcquiringDialog class in the GUI domain.
Grid Class: This class is used by both the Acquiring class and the AcquiringDialog class. It represents a square on the screen at which a user may be asked to look when taking photos.
[Figure: class diagram of the GUI domain, relating the SettingsDialog, ManagingDialog, PreprocessingDialog, AcquiringDialog and Grid classes with 1:1 and 1:* multiplicities]
MainFrame Class: This class displays the main window when the software starts. It contains a menu bar and a window playing the video stream that the web camera captures.
SettingsDialog Class: This class displays a dialog window which allows users to change the default settings.
ManagingDialog Class: This class displays a dialog window which allows the administrator to delete users' ID numbers and their relevant images, to export files to another directory, and to pre-process the images.
PreprocessingDialog Class: This class displays a dialog window which allows users to select multiple files or directories of files for pre-processing.
AcquiringDialog Class: This class displays a dialog window which allows users to take photos with a web camera. This class also gives instructions that guide users through the face image acquisition process.
4 Implementation
The above is a snapshot of the running program. The little face icon in the pane marks the position at which the user should look. Once the user is ready, he/she clicks on the shoot button.

The face icon then moves to the next spot at which the user should look. The order is random, so that the user moves not only his/her eyes but also his/her head. A line is therefore drawn to point to the next position; this is necessary because the user tends to get lost easily when sitting in front of a three-screen computer.
The little face icon not only serves as a position guide, but also tells the user what kind of expression to show. For example, the face icon above is a smiling one, so the user should smile when the current photo is taken.
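The random visiting order described above could be produced by shuffling the indices of the grid squares. The following is only a minimal sketch, not the project's actual code: the class name ShootingOrder, the method randomOrder and the idea of numbering the squares from 0 are assumptions made for illustration, and the sketch uses current Java syntax rather than J2SE 1.4.2.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper: decides in which order the face icon visits the grid squares.
final class ShootingOrder {

    /** Returns the square indices 0..squareCount-1 in a random order. */
    static List<Integer> randomOrder(int squareCount, Random rng) {
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < squareCount; i++) {
            order.add(i);
        }
        Collections.shuffle(order, rng); // uniform random permutation
        return order;
    }

    public static void main(String[] args) {
        // Nine squares is an arbitrary example value, not taken from the report.
        System.out.println(randomOrder(9, new Random()));
    }
}
```

Each square appears exactly once in the result, so every position is photographed once while the jump from one position to the next stays unpredictable.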
The ideal result, of course, is to remove all the unwanted blue background and leave the whole face intact. In practice there is always a gap between the ideal and reality, so a compromise is made: the output image may have some blue pixels left, but the face itself must remain intact. Damage to the neck and the ears is tolerable to some extent, as they contribute little to face recognition. For example, a close look at the right picture (output) shows that part of the left ear is missing; it is still an acceptable output.

Algorithm:
Firstly, remove the obviously blue pixels (set those blue pixels to white).
Pseudo code:
    for (i = 0; i < image.width; i++)
        for (j = 0; j < image.height; j++)
            if (isBluePixel(getRGB(i, j)))
                setRGB(i, j, white)

    isBluePixel(int R, int G, int B)
        return true if RGB falls into BLUE_AREA; else return false
BLUE_AREA: a set of RGB ranges in which the color is blue. (See the example for more details.)
Secondly, for the blue pixels that remain, if the number of consecutive vertical or horizontal non-white pixels is less than a constant, it is assumed that those pixels do not belong to the face.
Pseudo code:
    for (i = 0; i < image.height; i++)
        if (scanline i contains fewer than MIN_FACE_WIDTH non-white pixels)
            set these non-white pixels to white

    for (i = 0; i < image.width; i++)
        if (vertical line i contains fewer than MIN_FACE_HEIGHT non-white pixels)
            set these non-white pixels to white

Experiments are needed to decide MIN_FACE_WIDTH and MIN_FACE_HEIGHT.
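The two steps above can be sketched in Java with java.awt.image.BufferedImage. This is a minimal sketch under stated assumptions rather than the project's actual implementation: the class and method names are hypothetical, BLUE_AREA is assumed to be an array of {rMin, rMax, gMin, gMax, bMin, bMax} boxes, and, like the pseudo code, the second step counts all non-white pixels on a line instead of looking for consecutive runs.

```java
import java.awt.image.BufferedImage;

// Hypothetical sketch of the two background-removal steps.
final class BackgroundRemover {
    static final int WHITE = 0xFFFFFFFF; // opaque white in ARGB

    /** True if rgb lies inside any {rMin, rMax, gMin, gMax, bMin, bMax} box of blueArea. */
    static boolean isBluePixel(int rgb, int[][] blueArea) {
        int r = (rgb >> 16) & 0xFF, g = (rgb >> 8) & 0xFF, b = rgb & 0xFF;
        for (int[] box : blueArea) {
            if (r >= box[0] && r <= box[1] && g >= box[2] && g <= box[3]
                    && b >= box[4] && b <= box[5]) {
                return true;
            }
        }
        return false;
    }

    static boolean isWhite(BufferedImage img, int x, int y) {
        return (img.getRGB(x, y) | 0xFF000000) == WHITE; // ignore the alpha byte
    }

    /** Step 1: paint every pixel that falls into BLUE_AREA white. */
    static void removeBlue(BufferedImage img, int[][] blueArea) {
        for (int y = 0; y < img.getHeight(); y++) {
            for (int x = 0; x < img.getWidth(); x++) {
                if (isBluePixel(img.getRGB(x, y), blueArea)) img.setRGB(x, y, WHITE);
            }
        }
    }

    /** Step 2, horizontal pass: whiten rows holding fewer than minFaceWidth non-white pixels. */
    static void cleanRows(BufferedImage img, int minFaceWidth) {
        for (int y = 0; y < img.getHeight(); y++) {
            int count = 0;
            for (int x = 0; x < img.getWidth(); x++) if (!isWhite(img, x, y)) count++;
            if (count < minFaceWidth) {
                for (int x = 0; x < img.getWidth(); x++) img.setRGB(x, y, WHITE);
            }
        }
    }

    /** Step 2, vertical pass: whiten columns holding fewer than minFaceHeight non-white pixels. */
    static void cleanColumns(BufferedImage img, int minFaceHeight) {
        for (int x = 0; x < img.getWidth(); x++) {
            int count = 0;
            for (int y = 0; y < img.getHeight(); y++) if (!isWhite(img, x, y)) count++;
            if (count < minFaceHeight) {
                for (int y = 0; y < img.getHeight(); y++) img.setRGB(x, y, WHITE);
            }
        }
    }
}
```

A caller would run removeBlue first and then cleanRows and cleanColumns. In this sketch a row or column that fails the threshold is whitened completely, which is equivalent to whitening only its non-white pixels.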
Given good BLUE_AREA, MIN_FACE_WIDTH, and MIN_FACE_HEIGHT values, these two steps are sufficient to remove the background. To improve performance, the algorithm can be modified further. Since a face takes up only a small portion of the whole photo, it is wasteful to process all the horizontal and vertical lines. It is much better to start processing from the middle of the photo; once the program is convinced that the current horizontal or vertical line does not cross the face, it can simply ignore the rest.
[Figure: the photo with two marked lines, Line 1 and Line 2]
Here is an example. Suppose the program starts processing at line 1 and moves toward line 2. When it reaches line 2, it finds that line 2 does not cross the face, so it stops processing and simply sets all the pixels between line 2 and the left side of the image to white.
Pseudo code (modified from the second part of the above pseudo code):
    for (i = image.height/2; i < image.height; i++)
        cross_face = false
        if (scanline i contains fewer than MIN_FACE_WIDTH non-white pixels)
            set these non-white pixels to white
        else
            cross_face = true
        if (cross_face == false)
            break
    for ( ; i < image.height; i++)
        set the whole scanline i to white
This pseudo code only processes the lower half of the image; the idea is the same for the other parts.

Smooth the face
After the previous steps, the above face would be a possible outcome. When the face is zoomed in, it turns out the outline of the face is not smooth. So the next step is to smooth the outline of the face. Here is the idea:
The zigzag line above represents part of the face outline. The idea is to find all the concave pixels (highlighted by the circles), draw a line between each pair of adjacent concave pixels, and cut off the protruding polygons. Here is the output:
Though not perfect yet, it is a lot better than the previous one. The real difficulty of removing the background, as mentioned earlier, is that a series of experiments needs to be carried out to determine the three important values: BLUE_AREA, MIN_FACE_WIDTH and MIN_FACE_HEIGHT. Once determined, these values can be reused for all the photos taken in the same environment. However, if the environment changes, for instance if the light in the room becomes a little dimmer, the values may need to be reset. Hence a program that makes these values easy to set is needed. The next part walks through an example to show how the values are set and how the background is removed step by step.
Example:
The left is the original image and the right is the control panel. Use the mouse to drag a rectangle on the blue background but keep it small. Later it can be seen how a big rectangle will ruin the face.
In the upper-left corner, the numbers 31 103 51 150 138 255 represent the current BLUE_AREA: 31 is the lower limit and 103 the upper limit for RED, 51 and 150 are the limits for GREEN, and 138 and 255 are the limits for BLUE. Horizontal represents MIN_FACE_WIDTH and Vertical represents MIN_FACE_HEIGHT. The latter two values are currently 0; they will be used once BLUE_AREA is determined. Since large patches of blue pixels still remain, the previous step is repeated: again, drag a small rectangle on the blue area.
Now the BLUE_AREA is 31 103 51 150 138 255 plus 102 128 158 186 250 255. The dragging rectangle step continues until there is no big patch of blue pixels remaining.
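One plausible way for the control panel to turn a dragged rectangle into a BLUE_AREA entry is to take the smallest and largest RED, GREEN and BLUE values found inside the selection. The report does not state how the ranges are computed, so the rule below is an assumption, and the class BlueAreaBuilder and its methods are hypothetical names used only for this sketch.

```java
import java.awt.Rectangle;
import java.awt.image.BufferedImage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: BLUE_AREA as a growing list of RGB boxes, one per dragged rectangle.
final class BlueAreaBuilder {
    private final List<int[]> boxes = new ArrayList<>();

    /** Adds the tightest {rMin, rMax, gMin, gMax, bMin, bMax} box covering the selected pixels. */
    int[] addSelection(BufferedImage img, Rectangle sel) {
        int[] box = {255, 0, 255, 0, 255, 0};
        for (int y = sel.y; y < sel.y + sel.height; y++) {
            for (int x = sel.x; x < sel.x + sel.width; x++) {
                int rgb = img.getRGB(x, y);
                int[] c = {(rgb >> 16) & 0xFF, (rgb >> 8) & 0xFF, rgb & 0xFF};
                for (int k = 0; k < 3; k++) {
                    box[2 * k] = Math.min(box[2 * k], c[k]);         // lower limit of channel k
                    box[2 * k + 1] = Math.max(box[2 * k + 1], c[k]); // upper limit of channel k
                }
            }
        }
        boxes.add(box);
        return box;
    }

    /** The accumulated BLUE_AREA, one row per selection. */
    int[][] toArray() {
        return boxes.toArray(new int[0][]);
    }
}
```

Under this assumption, a large rectangle that accidentally includes a few non-blue pixels widens the box to cover colours that also occur in the face, which is consistent with the advice to keep each rectangle small.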
Now that all patches of blue pixels are too small to drag a rectangle on, it is time to use the Horizontal and Vertical value. Set both of them to 10 and check the result.
The result is good, but it could be improved by making the Horizontal value a bit larger.
Then click on the Save button to save these settings to a file. The background removal program reads the settings from the file and then is able to process all the photos taken in the same environment.
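The report does not describe the format of the settings file. As one possible approach, the three values could be stored with java.util.Properties. The sketch below is an assumption throughout: the class RemovalSettings, the key names and the choice of a properties file are all hypothetical.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical sketch: persisting the background removal settings as a properties file.
final class RemovalSettings {

    /** Writes BLUE_AREA (as text), MIN_FACE_WIDTH and MIN_FACE_HEIGHT to the given file. */
    static void save(File file, String blueArea, int minFaceWidth, int minFaceHeight) {
        Properties p = new Properties();
        p.setProperty("blueArea", blueArea);
        p.setProperty("minFaceWidth", Integer.toString(minFaceWidth));
        p.setProperty("minFaceHeight", Integer.toString(minFaceHeight));
        try (OutputStream out = new FileOutputStream(file)) {
            p.store(out, "background removal settings");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the settings back; callers parse the values they need. */
    static Properties load(File file) {
        Properties p = new Properties();
        try (InputStream in = new FileInputStream(file)) {
            p.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return p;
    }
}
```

With the values from the example, a call such as save(file, "31 103 51 150 138 255;102 128 158 186 250 255", 10, 10) would record both colour ranges and the two thresholds, so that photos taken later in the same environment can be processed without repeating the configuration.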
of pixels of the output image will be slightly greater than 3350. Then scale the image to width W and height H. Java provides methods that perform the scaling given the target width and target height; they are covered in section 4.5.

Step 2: Making the number of pixels of the face exactly equal to 3350.
The idea is to remove the least significant pixels. For ease of implementation, the extra pixels are removed from the neck. Technically, the program processes the image from bottom to top, setting each non-white pixel it meets to white for as long as the total number of non-white pixels is greater than 3350.
Pseudo code:
    remainingPixel = totalPixel - 3350
    for (y = image.height - 1; y >= 0; y--)
        for (x = 0; x < image.width; x++)
            if (remainingPixel <= 0)
                return
            if (image.getRGB(x, y) is a non-white pixel)
                image.setRGB(x, y, white)
                remainingPixel = remainingPixel - 1
One may ask whether this would eventually ruin the face, since the processing runs from bottom to top. It is very unlikely. The input image for this step is the output image of step 1, which has already brought the number of non-white pixels close to 3350, so only a few extra pixels need to be removed. Therefore, it is safe to apply the above pseudo code.

Step 3: Making the size of the image 80*80 pixels.
The instinctive approach is to clip the image into a square and scale it to 80 by 80. Unfortunately, this does not work, because the 3350-pixel condition can no longer be guaranteed. The correct idea is to cut out the rectangular boundary of the face and add extra white pixels to make the size 80 by 80.
[Figure: a 70 by 70 face rectangle padded with extra white pixels to 80 by 80]
The above is just an example intended to familiarize readers with the idea; the actual face boundary is not exactly a square. Under normal circumstances, a human face should fit within 80 by 80 pixels. However, one special case needs to be taken into consideration: if a person's face is so long that the height of the boundary exceeds 80, the excess part is ignored. This is an extreme case that happens with an extremely low probability. Recognizing such an unusually shaped face should not be a problem for any face recognition system, and the loss of a small part of the image would not be a problem either.
The following pseudo code detects the rectangular boundary of the face:
    found = false
    for (y = 0; y < image.height; y++)
        for (x = 0; x < image.width; x++)
            if (image.getRGB(x, y) is not white)
                if (found == false)
                    xMin = x; xMax = x; yMin = y; yMax = y
                    found = true
                else
                    yMax = y
                    if (x < xMin) xMin = x
                    if (x > xMax) xMax = x

Step 4: Converting the image to grey scale.
Suppose GREY is the grey value of a pixel, and R, G, B are its RED, GREEN and BLUE values respectively.
    GREY = R*0.3 + G*0.59 + B*0.11
Apply this formula to all the non-white pixels of the image.
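Steps 2 to 4 can be sketched in Java as follows. This is a minimal sketch, not the project's actual code: the class FaceNormalizer and its method names are hypothetical, centring the face on the white canvas is an assumption (the report only says that white pixels are added), and when the face rectangle is larger than the canvas the sketch keeps its top-left part, which is also an assumption.

```java
import java.awt.Rectangle;
import java.awt.image.BufferedImage;

// Hypothetical sketch of steps 2 to 4 of the scaling procedure.
final class FaceNormalizer {
    static final int WHITE = 0xFFFFFFFF; // opaque white in ARGB

    static boolean isWhite(BufferedImage img, int x, int y) {
        return (img.getRGB(x, y) | 0xFF000000) == WHITE;
    }

    static int countNonWhite(BufferedImage img) {
        int n = 0;
        for (int y = 0; y < img.getHeight(); y++)
            for (int x = 0; x < img.getWidth(); x++)
                if (!isWhite(img, x, y)) n++;
        return n;
    }

    /** Step 2: whiten non-white pixels from the bottom row upward until target remain. */
    static void trimTo(BufferedImage img, int target) {
        int extra = countNonWhite(img) - target;
        for (int y = img.getHeight() - 1; y >= 0 && extra > 0; y--)
            for (int x = 0; x < img.getWidth() && extra > 0; x++)
                if (!isWhite(img, x, y)) {
                    img.setRGB(x, y, WHITE);
                    extra--;
                }
    }

    /** Step 3a: smallest rectangle containing every non-white pixel, or null if there is none. */
    static Rectangle faceBounds(BufferedImage img) {
        int xMin = Integer.MAX_VALUE, yMin = Integer.MAX_VALUE, xMax = -1, yMax = -1;
        for (int y = 0; y < img.getHeight(); y++)
            for (int x = 0; x < img.getWidth(); x++)
                if (!isWhite(img, x, y)) {
                    xMin = Math.min(xMin, x);
                    xMax = Math.max(xMax, x);
                    yMin = Math.min(yMin, y);
                    yMax = Math.max(yMax, y);
                }
        return xMax < 0 ? null : new Rectangle(xMin, yMin, xMax - xMin + 1, yMax - yMin + 1);
    }

    /** Step 3b: copy the face rectangle onto a white size-by-size canvas (centred: an assumption). */
    static BufferedImage padTo(BufferedImage img, Rectangle face, int size) {
        BufferedImage out = new BufferedImage(size, size, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < size; y++)
            for (int x = 0; x < size; x++) out.setRGB(x, y, WHITE);
        int w = Math.min(face.width, size), h = Math.min(face.height, size); // excess is ignored
        int dx = (size - w) / 2, dy = (size - h) / 2;
        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++)
                out.setRGB(dx + x, dy + y, img.getRGB(face.x + x, face.y + y));
        return out;
    }

    /** Step 4: replace every non-white pixel by its grey value 0.3 R + 0.59 G + 0.11 B. */
    static void toGrey(BufferedImage img) {
        for (int y = 0; y < img.getHeight(); y++)
            for (int x = 0; x < img.getWidth(); x++)
                if (!isWhite(img, x, y)) {
                    int rgb = img.getRGB(x, y);
                    int r = (rgb >> 16) & 0xFF, g = (rgb >> 8) & 0xFF, b = rgb & 0xFF;
                    int grey = (int) Math.round(0.3 * r + 0.59 * g + 0.11 * b);
                    img.setRGB(x, y, 0xFF000000 | (grey << 16) | (grey << 8) | grey);
                }
    }
}
```

A typical sequence would be trimTo(img, 3350), then padTo(img, faceBounds(img), 80), then toGrey on the result. Because padding only adds white pixels, the number of non-white pixels reached in step 2 is preserved as long as the face rectangle fits inside the canvas.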
[Figures: the test picture saved in jpeg format and in png format]
Use mspaint, the drawing program of Windows, to create a picture with a centred black rectangle, and save it in jpeg format and in png format. Take a close look at the jpeg image, especially the part highlighted by the circle. This is not dust on the paper but extra non-white pixels that are not supposed to be there. These extra non-white pixels cause at least two problems:
1. The 3350-pixel requirement (FR2) would never be met.
2. The rectangular boundary of the face would be larger than it should be, so the face might not fit within 80 by 80 pixels.
There will certainly be other risks when jpeg images are processed further. The above is sufficient to show that the png format should be used for image processing, not to mention that the png image even takes up less hard disk space.
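The same experiment can be repeated in code with javax.imageio.ImageIO: write a test picture in a given format, read it back, and count the pixels whose colour changed. The sketch below is illustrative only; the class FormatCheck and its methods are hypothetical names, and the picture is generated in code instead of being drawn in mspaint.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.imageio.ImageIO;

// Hypothetical sketch: measure how many pixels a save/load round trip alters.
final class FormatCheck {

    /** A white canvas with a centred black rectangle, similar to the mspaint test picture. */
    static BufferedImage testPicture(int size) {
        BufferedImage img = new BufferedImage(size, size, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < size; y++) {
            for (int x = 0; x < size; x++) {
                boolean inside = x >= size / 4 && x < 3 * size / 4
                        && y >= size / 4 && y < 3 * size / 4;
                img.setRGB(x, y, inside ? 0x000000 : 0xFFFFFF);
            }
        }
        return img;
    }

    /** Writes img in the given format ("png" or "jpg"), reads it back, counts changed pixels. */
    static int changedPixels(BufferedImage img, String format) {
        try {
            File file = File.createTempFile("formatcheck", "." + format);
            file.deleteOnExit();
            if (!ImageIO.write(img, format, file)) {
                throw new IllegalStateException("no writer for " + format);
            }
            BufferedImage back = ImageIO.read(file);
            int changed = 0;
            for (int y = 0; y < img.getHeight(); y++) {
                for (int x = 0; x < img.getWidth(); x++) {
                    if ((img.getRGB(x, y) & 0xFFFFFF) != (back.getRGB(x, y) & 0xFFFFFF)) changed++;
                }
            }
            return changed;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Since png compression is lossless, changedPixels(testPicture(64), "png") returns 0. For "jpg" the count depends on the encoder and its quality setting, but lossy compression typically alters pixels near the edges of the rectangle, which is the effect described above.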
5 Testing
clear and descriptive for the user. All buttons and menu items on the actual interface should be tested to ensure that they operate to the client's satisfaction.
Performance Testing: From an end user's perspective, response time is the basic measure of performance used to judge the quality of the system. Administrators, on the other hand, are concerned not only with response time but also with the system's resource utilisation.
Regression Testing: Regression testing is used when there are subsequent releases of the software, to ensure that the current release does not break what worked in the previous one.
Reliability Testing: The reliability test is run over an extended period of time to make sure that the system does not exhibit any defects after substantial uptime and that it continues to perform within the desired response time.
3. Select source files.
4. Select target directory.
5. Click on OK button.
Pass/Fail Criteria: Fail if the output files are not in the target directory. Fail if the features of the face do not stand out in any of the output images.
Test Result: Passed.

Test ID: T4
Test Case Description: The system should allow the administrator to delete the users' face images.
Requirement ID: FR4
Test Steps:
1. Click on File menu.
2. Click on Manage menu item.
3. Select some user IDs.
4. Click on Delete button.
Pass/Fail Criteria: Pass if the directories with those names are deleted.
Test Result: Passed.

Test ID: T5
Test Case Description: The system should give a warning if the user tries to input an existing user ID.
Requirement ID: FR4
Test Steps:
1. Click on File menu.
2. Click on New User menu item.
3. Type an existing user ID.
4. Click on OK button.
Pass/Fail Criteria: Pass if a warning message is displayed.
Test Result: Passed.
6 Project Evaluation
Requirement  Priority   Completed
FR2          Mandatory  Yes
FR3          Mandatory  Yes
FR4          Mandatory  Yes
FR5          Optional   No
NFR1         Mandatory  Yes
NFR2         Mandatory  Yes
NFR3         Mandatory  Yes
NFR4         Optional   No

NFR4: Acquiring face images and pre-processing images should be able to take place simultaneously. That is, while the current user is taking photos, the software can pre-process the photos of the previous user.
FR5 is not completed for the following three reasons:
1. FR1 took much more time to complete than expected, because the client wanted the user interface to be very easy to operate and hence added some extra requirements.
2. The iSight web camera was not available until two weeks later than scheduled.
3. FR5 is in itself a huge, complex and unclear requirement, and it is difficult to predict how much time it would take to complete.
6.2 Pitfalls
There are two drawbacks in the current system:
1. Some settings need to be configured before the program removes the background of a series of images, and success cannot be 100% guaranteed for each image even if the photos have been taken in the same environment. This is a consequence of the algorithm used to remove the background.
2. It takes too long (approximately 2 seconds) to pre-process an image. Since the code has not been optimised, there should be room for improvement.
7 Conclusion
Some useful conclusions can be drawn from this project:
- Java is a powerful programming language for image acquisition and image processing.
- PNG is the format that should be used for image processing.

Finally, I would like to thank Professor Tom Gedeon and Mr Pascal Vuylsteker for their kind help and valuable suggestions.
8 References
C. Adamson (2003) What's up with the Mac OS X Java and QuickTime? http://www.oreillynet.com/pub/wlg/2933 accessed Apr 7, 2005
W. Zhao, R. Chellappa, P. J. Phillips, and A. Rosenfeld (2000) Face Recognition: A Literature Survey
Appendix A: Schedule
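Appendix B: Work Breakdown Structure
[Figure: work breakdown structure. Top-level tasks: Analysis, Modelling, Implementation, Testing, Documentation. Sub-tasks recoverable from the figure: Selecting tools, Developing class diagram, Automating Image Capture, Removing background, Scaling size, Detecting features.]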
C.1 Installation
Prerequisites
- J2SE 1.4.2 http://java.sun.com/j2se/1.4.2/download.html
- Any web/digital/video camera with a FireWire connection to a Macintosh machine

On Macintosh operating system
1. Unzip the installation file
2. Type make clean
3. Type make
4. Type make run
On other operating systems
The image acquisition function is not available on other operating systems. However, the other image processing functions are still available. The installation steps are the same as those for Macintosh.
C.2.1 File
Menu Command: New User
Description: Brings up an input dialog box. After the user inputs an ID, it asks the user to choose an order in which to take photos. Then an instruction manual is displayed to help the user get through the image acquisition process. The user clicks the start button to start the image acquisition process.

Menu Command: Manage
Description: Brings up a manage dialog box. The administrator can use this dialog box to delete images, to export pre-processed images to a target directory and to

Menu Command: Exit
C.2.2 Image
Menu Command: Remove Background
Description: Removes the background of the selected images. The submenu Files allows selecting multiple files and the submenu Directories allows selecting multiple directories.

Menu Command: Preprocess
Description: Pre-processes the selected images. The submenu Files allows selecting multiple files and the submenu Directories allows selecting multiple directories.

Menu Command: Detect feature
Description: Detects features of the selected images. The submenu Files allows selecting multiple files and the submenu Directories allows selecting multiple directories.
C.2.3 Configure
Menu Command: Default Settings
Description: Brings up a default settings dialog for changing options and settings related to image pre-processing.

Menu Command: Background Removal Settings
Description: Brings up a file choosing dialog box that allows selecting multiple image files. These image files serve as sample files for configuring the values for background removal.
C.3 FAQ
1. How can I start collecting Images?
2. How can I remove the background of images?
3. How can I do background removal settings?
4. Do I have to do background removal settings every time?
5. How can I pre-process the images?
6. How can I detect feature of the images?
7. Is it possible to select the whole directory as the input?
8. How can I delete the image files?
9. Which directory stores the image files?
5. Change Horizontal and Vertical value to get better result
6. Use Undo or Remove button to correct mistakes
7. Check if the settings fit other sample files