ABSTRACT
Skin diseases are becoming increasingly common as different types of allergies rise
rapidly. Many skin diseases can pass from one person to another, so it is important to
control them at an early stage to prevent them from spreading. In this paper, we study
the problem of automated skin disease detection and provide the user with advice or
treatments based on the results, in a shorter time than existing methods. We construct
a diagnosis system based on image processing and data mining techniques. We use
MATLAB to perform the pre-processing and processing of the skin images, which are
obtained from the given data set.
CHAPTER 1
INTRODUCTION
In an 8-bit grey scale image each picture element has an assigned intensity that
ranges from 0 to 255. A grey scale image is what people normally call a black and
white image, but the name emphasizes that such an image also includes many
shades of grey.
Each pixel has a value from 0 (black) to 255 (white). The possible range of the pixel
values depends on the colour depth of the image; here 8 bit = 256 tones or grey
scales. A normal grey scale image has 8-bit colour depth = 256 grey scales.
Some grey scale images have more grey levels, for instance 16 bit = 65,536
grey scales. In principle, three 16-bit grey scale images can be combined to form a
colour image with 281,474,976,710,656 possible values.
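The tone counts quoted above follow directly from the bit depth. The short Python sketch below (an illustration only; the function name `tone_count` is hypothetical) reproduces the arithmetic, assuming the largest figure refers to three 16-bit channels combined.

```python
def tone_count(bits_per_channel: int, channels: int = 1) -> int:
    """Number of distinct values representable with the given bit depth."""
    return 2 ** (bits_per_channel * channels)

print(tone_count(8))      # 256 grey levels for an 8-bit image
print(tone_count(16))     # 65536 grey levels for a 16-bit image
print(tone_count(16, 3))  # 281474976710656 values for three 16-bit channels
```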
There are two general groups of 'images': vector graphics (or line art) and
bitmaps (pixel-based images).
1.1.2 Perception of colors
Images based on radiation from the EM spectrum are the most familiar,
especially images in the X-ray and visual bands of the spectrum. Electromagnetic
waves can be conceptualized as propagating sinusoidal waves of varying
wavelengths, or they can be thought of as a stream of massless particles, each
traveling in a wavelike pattern and moving at the speed of light. Each bundle of
energy is called a photon. If spectral bands are grouped according to energy per
photon, we obtain the spectrum shown in Figure 4, ranging from gamma rays at one
end to radio waves at the other. The bands are shown shaded to convey the fact that
bands of the EM spectrum are not distinct but rather transition smoothly from one to
the other.
Figure 4: The electromagnetic spectrum arranged according to energy per photon
1.2 Image File Formats
Image file formats are standardized means of organizing and storing digital images.
This section covers formats used to store photographic and other images. Image
files are composed of either pixel or vector (geometric) data; with few exceptions,
such as vector graphic displays, the data are rasterized to pixels when displayed.
Including proprietary types, there are hundreds of image file types. The
PNG, JPEG, and GIF formats are most often used to display images on the
Internet.
JPEG/JFIF:
JPEG (Joint Photographic Experts Group) is a compression method. JPEG
compressed images are usually stored in the JFIF (JPEG File Interchange Format)
file format. JPEG compression is lossy compression. Nearly every digital camera
can save images in the JPEG/JFIF format, which supports 8 bits per color (red,
green, blue) for a 24-bit total, producing relatively small files. Photographic
images may be better stored in a lossless non-JPEG format if they will be re-edited,
or if small "artifacts" are unacceptable. The JPEG/JFIF format also is used as the
image compression algorithm in many Adobe PDF files.
EXIF:
The EXIF (Exchangeable image file format) format is a file standard similar
to the JFIF format with TIFF extensions. It is incorporated in the JPEG writing
software used in most cameras. Its purpose is to record and to standardize the
exchange of images with image metadata between digital cameras and editing and
viewing software. The metadata are recorded for individual images and include
such things as camera settings, time and date, shutter speed, exposure, image size,
compression, name of camera, color information, etc. When images are viewed or
edited by image editing software, all of this image information can be displayed.
TIFF:
The TIFF (Tagged Image File Format) format is a flexible format that normally
saves 8 bits or 16 bits per color (red, green, blue) for 24-bit and 48-bit totals,
respectively, usually using either the TIFF or TIF filename extension. TIFF files can
be either lossy or lossless. Some offer relatively good lossless compression for bi-level
(black & white) images. Some digital cameras can save in TIFF format, using the
LZW compression algorithm for lossless storage. TIFF image format is not widely
supported by web browsers. TIFF remains widely accepted as a photograph file
standard in the printing business. TIFF can handle device-specific color spaces,
such as the CMYK defined by a particular set of printing press inks.
PNG:
The PNG (Portable Network Graphics) file format was created as the free,
open-source successor to the GIF. The PNG file format supports true color (16
million colors) while the GIF supports only 256 colors. The PNG file excels when
the image has large, uniformly colored areas. The lossless PNG format is best
suited for editing pictures, and the lossy formats, like JPG, are best for the final
distribution of photographic images, because JPG files are smaller than PNG files.
PNG is an extensible file format for the lossless, portable, well-compressed storage
of raster images. It provides a patent-free replacement for GIF and can also
replace many common uses of TIFF. Indexed-color, grayscale, and true color
images are supported, plus an optional alpha channel. PNG is designed to work
well in online viewing applications, such as the World Wide Web. PNG is robust,
providing both full file integrity checking and simple detection of common
transmission errors.
GIF:
GIF (Graphics Interchange Format) is limited to an 8-bit palette, or 256
colors. This makes the GIF format suitable for storing graphics with relatively few
colors such as simple diagrams, shapes, logos and cartoon style images. The GIF
format supports animation and is still widely used to provide image animation
effects. It also uses a lossless compression that is more effective when large areas
have a single color, and ineffective for detailed images or dithered images.
BMP:
The BMP file format (Windows bitmap) handles graphics files within the
Microsoft Windows OS. Typically, BMP files are uncompressed, hence they are
large. The advantage is their simplicity and wide acceptance in Windows
programs.
As opposed to the raster image formats above (where the data describes the
characteristics of each individual pixel), vector image formats contain a geometric
description which can be rendered smoothly at any desired display size.
CGM:
CGM (Computer Graphics Metafile) is a file format for 2D vector graphics,
raster graphics, and text. All graphical elements can be specified in a textual source
file that can be compiled into a binary file or one of two text representations. CGM
provides a means of graphics data interchange for computer representation of 2D
graphical information independent from any particular application, system,
platform, or device.
SVG:
SVG (Scalable Vector Graphics) is an open standard created and developed
by the World Wide Web Consortium to address the need for a versatile, scriptable
and all-purpose vector format for the web and otherwise. The SVG format does not
have a compression scheme of its own, but due to the textual nature of XML, an
SVG graphic can be compressed using a program such as gzip.
1.3 Digital images
Suppose we take an image, a photo, say. For the moment, let's make things
easy and suppose the photo is black and white (that is, lots of shades of grey), so no
colour. We may consider this image as being a two-dimensional function where the
function values give the brightness of the image at any given point. We may
assume that in such an image brightness values can be any real numbers in a
range running from black to white.
A digital image differs from a photo in that its values are all discrete. Usually they
take only integer values, with brightness ranging from 0 (black) to
255 (white). A digital image can be considered as a large array of discrete dots,
each of which has a brightness associated with it. These dots are called picture
elements, or more simply pixels. The pixels surrounding a given pixel constitute its
neighborhood. A neighborhood can be characterized by its shape in the same way
as a matrix: we can speak of a neighborhood by its number of rows and columns.
Except in very special circumstances, neighborhoods have odd numbers of rows and
columns; this ensures that the current pixel is in the center of the neighborhood.
Image pre-processing is the term for operations on images at the lowest level of
abstraction. These operations do not increase image information content; they
decrease it if entropy is used as the information measure. The aim of pre-processing
is an improvement of the image data that suppresses undesired distortions or enhances
image features relevant for further processing and analysis. Image pre-processing
uses the redundancy in images: neighboring pixels corresponding to one
real object have the same or similar brightness values. If a distorted pixel can be
picked out from the image, it can be restored as the average value of its neighboring
pixels. Image pre-processing methods can be classified into categories according to
the size of the pixel neighborhood that is used for the calculation of a new pixel
brightness. This paper presents some pixel brightness transformations
and local pre-processing methods implemented in MATLAB.
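The idea of restoring a distorted pixel from its neighbors can be sketched in a few lines. The Python example below is an illustration only, not the MATLAB code of this project; the function name `restore_pixel` and the choice of a 3 x 3 neighborhood with the centre excluded are assumptions.

```python
def restore_pixel(image, row, col):
    """Return a copy of image in which image[row][col] is replaced by the
    rounded mean of its 3 x 3 neighbours (centre excluded, border clipped)."""
    rows, cols = len(image), len(image[0])
    neighbours = [
        image[r][c]
        for r in range(max(0, row - 1), min(rows, row + 2))
        for c in range(max(0, col - 1), min(cols, col + 2))
        if (r, c) != (row, col)
    ]
    restored = [list(line) for line in image]  # leave the input untouched
    restored[row][col] = round(sum(neighbours) / len(neighbours))
    return restored

# A smooth patch with one distorted pixel (255) in the centre.
patch = [
    [100, 102, 101],
    [ 99, 255, 100],
    [101, 100,  99],
]
print(restore_pixel(patch, 1, 1)[1][1])  # 100
```

The sketch presumes the distorted pixel has already been located; detecting it is a separate problem.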
1.7.9 Knowledgebase:
1.8.2 Agriculture
1.8.3 Industry
Fingerprint analysis,
Sharpening or de-blurring of speed-camera images.
Existing method:
In this paper we propose a diagnosis system that enables users to detect and
recognize skin diseases with the help of image processing and data mining techniques,
and that provides the user with advice or treatments based on the results in a shorter
time than existing methods. The system is built on image processing techniques, and
we use MATLAB to perform the pre-processing and processing of the users' skin
images.
This processing is carried out on the different skin patterns, which are analyzed to
identify which skin disease the user is suffering from. This data helps in the early
detection of skin diseases and in providing their cure. In this way we aim to find a
cost-effective and feasible test method for the detection of skin disorders. The results
are classified according to the given prototype, and a diagnosis accuracy assessment
is performed to provide users with efficient and fast results.
In this paper we consider a series of images obtained from the given data set;
pre-processing and segmentation are performed on each image. After the image is
segmented we can determine whether the skin has been affected by any disease. We
take into consideration three diseases, viz. psoriasis, vitiligo and skin cancer. Once
the presence of disease is detected, the affected area is highlighted, indicating the
exact location of the disease on the skin. From the affected area we classify the
disease through data mining. The image is segmented using a tolerance value, which
is calculated from the histogram.
Block diagram:
Dataflow diagram:
CHAPTER 2
PROJECT DESCRIPTION
Skin cancer is a deadly disease. Skin has three basic layers. Skin cancer begins in
the outermost layer, which is made up of squamous cells (first layer), basal cells
(second layer), and melanocytes (innermost or third layer). Today, people of different
age groups suffer from skin diseases such as eczema, scalp ringworm, fungal skin
infections, skin cancer of varying severity, and psoriasis. These diseases strike without
warning and have been among the major life-threatening diseases of the past ten
years. If skin diseases are not treated at an early stage, they may lead to complications
in the body, including the spread of infection from one individual to another.
The spread of skin diseases can be prevented by investigating the infected region at
an early stage, so it is important to control them at the initial stages. Damage done to
the skin can also harm people's confidence and wellbeing. This has become a major
problem, and it is crucial to treat skin diseases properly at the initial stages to prevent
serious damage. Many skin diseases are very dangerous, particularly if not treated
early. Skin diseases are becoming more common because of increasing pollution, and
they tend to pass from one person to another. People tend to assume that some skin
diseases are not serious problems, and many try to treat skin infections using their own
methods. However, if these treatments are not suitable for the particular skin problem,
they can make it worse. People may also be unaware of how dangerous their skin
disease is, for instance in the case of skin cancer. With advances in medical imaging
technologies, the acquired data has become so rich that it goes beyond the human
capability for visual recognition and efficient use in clinical assessment.
Input image
The input to the proposed system is a dermoscopic image, that is, an image taken with
a dermatoscope. A dermatoscope is a kind of magnifier used to take pictures of skin
lesions. It is a hand-held instrument that makes it much easier to diagnose skin disease.
Pre-processing
Grayscale conversion
A grayscale image contains only brightness information. Each pixel value in a grayscale
image corresponds to an amount of light, so gradations of brightness can be
distinguished. An 8-bit image has brightness values from 0 to 255, where 0 represents
black and 255 represents white. In grayscale conversion, a colour image is converted
into a grayscale image. Grayscale images are easier and faster to process than colour
images. All subsequent image processing techniques are applied to the grayscale
image. In our proposed system, the colour (RGB) image is converted into a grayscale
image by using a weighted sum of its red, green and blue channels.
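A weighted-sum conversion can be sketched as follows. This Python example is an illustration only, not the project's MATLAB code. The weights 0.299, 0.587 and 0.114 (the ITU-R BT.601 luma coefficients) are an assumption, since the text does not state which weights are used, and the function name `rgb_to_gray` is hypothetical.

```python
# Assumed luma weights (ITU-R BT.601); the text only says "weighted sum".
WEIGHTS = (0.299, 0.587, 0.114)

def rgb_to_gray(image):
    """Convert rows of (R, G, B) tuples in 0..255 to rows of grey levels in 0..255."""
    return [
        [min(255, int(WEIGHTS[0] * r + WEIGHTS[1] * g + WEIGHTS[2] * b + 0.5))
         for (r, g, b) in row]
        for row in image
    ]

rgb = [[(255, 255, 255), (0, 0, 0)],
       [(255, 0, 0), (0, 0, 255)]]
print(rgb_to_gray(rgb))  # [[255, 0], [76, 29]]
```

Green carries the largest weight because the eye is most sensitive to it, which is why pure red and pure blue map to fairly dark grey levels.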
Noise Removal
The objective of noise removal is to detect and remove unwanted noise from a digital
image. The difficulty is in deciding which features of an image are real and which are
caused by noise. Noise is random variation in pixel values. In our proposed system we
use a median filter to remove unwanted noise. The median filter is a nonlinear filter
that largely preserves edges. It is implemented with a sliding window of odd length:
the sample values within the window are sorted by magnitude, and the centre value,
the median of the samples in the window, is the filter output.
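The sliding-window procedure described above can be sketched in Python as follows (an illustration only, not the project's MATLAB code). The function name `median_filter`, the default 3 x 3 window, and the choice to clip the window at the image border are assumptions.

```python
def median_filter(image, size=3):
    """Apply a size x size median filter; windows are clipped at the image border."""
    if size % 2 == 0:
        raise ValueError("window size must be odd")
    half = size // 2
    rows, cols = len(image), len(image[0])
    output = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Collect and sort every sample inside the (clipped) window.
            window = sorted(
                image[rr][cc]
                for rr in range(max(0, r - half), min(rows, r + half + 1))
                for cc in range(max(0, c - half), min(cols, c + half + 1))
            )
            # Middle element; for clipped windows with an even count this is the upper median.
            output[r][c] = window[len(window) // 2]
    return output

noisy = [[10, 10, 10],
         [10, 255, 10],
         [10, 10, 10]]
print(median_filter(noisy))  # the isolated 255 is replaced by 10
```

Because the output is always one of the input samples, a sharp step between two regions is kept rather than blurred, which is the edge-preserving behaviour mentioned above.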
Image enhancement
Segmentation
Segmentation is the process of separating the region of interest from the given image.
A region of interest contains pixels with similar attributes. Here we use maximum
entropy thresholding for segmentation. First we take the gray levels of the original
image, then calculate the histogram of the grayscale image, and then use maximum
entropy to separate the foreground from the background. After maximum entropy
thresholding we obtain a binary image, that is, a black and white image.
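The steps above (histogram, entropy criterion, binary image) can be sketched in Python as follows. This is an illustration only: it assumes the common Kapur-style formulation, in which the threshold maximises the sum of the entropies of the two classes, which may differ in detail from the project's MATLAB implementation. The function names are hypothetical.

```python
import math

def max_entropy_threshold(gray):
    """Return the grey level t that maximises the summed entropy of the two
    histogram classes (levels <= t and levels > t)."""
    hist = [0] * 256
    for row in gray:
        for value in row:
            hist[value] += 1
    total = sum(hist)

    def class_entropy(counts, mass):
        # Entropy of the class distribution obtained by normalising counts by mass.
        return -sum((c / mass) * math.log(c / mass) for c in counts if c > 0)

    best_t, best_score = 0, float("-inf")
    below = 0
    for t in range(255):
        below += hist[t]
        above = total - below
        if below == 0 or above == 0:
            continue  # one class would be empty
        score = class_entropy(hist[:t + 1], below) + class_entropy(hist[t + 1:], above)
        if score > best_score:
            best_t, best_score = t, score
    return best_t

def to_binary(gray, t):
    """Pixels above the threshold become white (255), the rest black (0)."""
    return [[255 if value > t else 0 for value in row] for row in gray]

gray = [[10, 12, 200, 202],
        [12, 10, 202, 200]]
t = max_entropy_threshold(gray)
print(t, to_binary(gray, t))
```

Whether the lesion ends up as the white or the black region depends on the image, so a real pipeline may need to invert the mask.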
Feature extraction
Feature extraction plays an important role in extracting the information present in a
given image. Here we use the gray level co-occurrence matrix (GLCM) for texture
image analysis. The GLCM captures the spatial dependency between image pixels.
It works on the gray level image matrix to capture common features such as
contrast, mean, energy and homogeneity.
The purpose of feature extraction with the GLCM is to reduce the original image data
set by measuring certain values or features that help to distinguish different images
from one another.
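A minimal Python sketch of a GLCM and the four features named above is given below (an illustration only, not the project's MATLAB code). Several details are assumptions: the image is already quantised to a small number of levels, the pixel pair offset is one step to the right, the matrix is made symmetric, and the function names are hypothetical.

```python
def glcm(gray, levels, dx=1, dy=0):
    """Normalised, symmetric co-occurrence matrix for pixel pairs offset by (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(gray), len(gray[0])
    for r in range(rows):
        for c in range(cols):
            rr, cc = r + dy, c + dx
            if 0 <= rr < rows and 0 <= cc < cols:
                i, j = gray[r][c], gray[rr][cc]
                counts[i][j] += 1
                counts[j][i] += 1  # count the pair in both directions
    total = sum(map(sum, counts))
    return [[n / total for n in row] for row in counts]

def texture_features(p):
    """Contrast, energy, homogeneity and mean of a normalised GLCM p."""
    levels = len(p)
    cells = [(i, j, p[i][j]) for i in range(levels) for j in range(levels)]
    return {
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "energy": sum(v * v for _, _, v in cells),
        "homogeneity": sum(v / (1 + abs(i - j)) for i, j, v in cells),
        "mean": sum(i * v for i, _, v in cells),
    }

# A 4 x 4 image quantised to 4 grey levels.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 2, 2, 2],
         [2, 2, 3, 3]]
print(texture_features(glcm(image, levels=4)))
```

The resulting dictionary is a compact description of the texture; a perfectly uniform image gives contrast 0 and energy 1.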
Classifier
The classifier is used to separate cancerous images from other skin diseases. For
simplicity, a Support Vector Machine (SVM) classifier is used here. The SVM takes a
set of images and predicts, for each input image, which of the two classes, cancerous
or non-cancerous, it belongs to. The purpose of the SVM is to create a hyperplane that
separates the two classes with the maximum gap between them. In our proposed
system the output of the GLCM is given as input to the SVM classifier, which takes
training data, testing data and grouping information and classifies whether the given
input image is cancerous or non-cancerous.
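To make the maximum-margin idea concrete, the Python sketch below trains a linear SVM by batch sub-gradient descent on the regularised hinge loss. It is an illustration only and not the solver used in the project; the toy feature vectors, the labels (+1 for cancerous, -1 for non-cancerous), the hyperparameters and the function names are all hypothetical.

```python
def train_linear_svm(samples, labels, lam=0.01, epochs=200, lr=0.1):
    """Batch sub-gradient descent on the L2-regularised hinge loss.
    Returns the weight vector w and bias b of the separating hyperplane."""
    dim, n = len(samples[0]), len(samples)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w = [lam * wi for wi in w]  # gradient of the regulariser
        grad_b = 0.0
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:  # sample lies inside the margin or is misclassified
                for k in range(dim):
                    grad_w[k] -= y * x[k] / n
                grad_b -= y / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def predict(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane x falls on."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1

# Hypothetical, already-centred two-dimensional texture features.
train_x = [(2.0, 2.0), (2.5, 1.5), (3.0, 2.5),
           (-2.0, -2.0), (-1.5, -2.5), (-3.0, -2.0)]
train_y = [1, 1, 1, -1, -1, -1]
w, b = train_linear_svm(train_x, train_y)
print([predict(w, b, x) for x in train_x])
```

In practice a library SVM with feature scaling and a held-out test set would be preferable; the sketch only shows how training data and class labels produce a separating hyperplane.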
CHAPTER 3
SOFTWARE DESCRIPTION
Software Requirement:-
5.2 Syntax:
5.2.1 Variables:
MATLAB does not have the easy-to-use interface of Adobe Photoshop.
MATLAB is used to test and tweak new image processing techniques and
algorithms. Almost everything in MATLAB is done through programming and
manipulation of raw image data rather than through a user interface. The effects and
filters in Photoshop (or any other image editing software) are actually algorithms. With
MATLAB, the user can write the kind of complex algorithms that Photoshop applies.
5.3.1 Image matrices:
MATLAB does have the casting functions uint8() and double(), but these
only change the data type and do not scale the values. Scaling must be done
manually. MATLAB has two formats because uint8 values take less
storage. In many older versions of MATLAB (such as version 6.0), direct arithmetic
operations on uint8 values are not possible because of accuracy issues. Therefore, to
perform arithmetic operations, the pixel values must be converted to double first.
In version 2006a this is not an issue, as MATLAB simply changes uint8 to double
first, does the operations, and then changes the values back to uint8.
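The difference between casting and scaling can be sketched in Python as follows (an illustration only, not MATLAB code). It assumes the usual convention that 8-bit images hold integers in 0..255 while floating-point images hold values in 0.0..1.0; the function names are hypothetical.

```python
def to_unit_range(pixels):
    """Scale 8-bit integer levels (0..255) to floating-point values in 0.0..1.0."""
    return [p / 255.0 for p in pixels]

def to_uint8(values):
    """Scale 0.0..1.0 values back to 8-bit levels, rounding and saturating at 0 and 255."""
    return [max(0, min(255, int(v * 255.0 + 0.5))) for v in values]

levels = [0, 64, 128, 255]
scaled = to_unit_range(levels)
brightened = [v * 1.5 for v in scaled]  # arithmetic carried out in floating point
print(to_uint8(brightened))  # [0, 96, 192, 255]
```

The last value saturates at 255 instead of wrapping around, which mirrors the convert, operate, convert-back behaviour described above.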
5.9 Uses of MATLAB
CONCLUSION:
An automated skin disease detection system is proposed that will help the medical
community with the early detection of skin diseases. The diagnosis methodology uses
digital image processing in MATLAB. The unique features of the enhanced images
are segmented using the histogram. Based on the results, the affected area is detected
and the skin diseases are classified.