
2015 4th International Conference on Instrumentation, Communications, Information Technology, and Biomedical Engineering (ICICI-BME)

Bandung, November 2-3, 2015

Extraction and Classification Texture of Inflammatory Cells and Nuclei in Normal Pap Smear Images
Dwiza Riana¹, Dwi H. Widyantoro² and Tati L. Mengko²
¹ Information System, STMIK Nusa Mandiri, Jakarta, Indonesia
(Tel: +62-817-771-998; E-mail: dwiza@bsi.ac.id)
² School of Electrical Engineering & Informatics, Institut Teknologi Bandung, Bandung, Indonesia
(Tel: +62-22-2502260; E-mail: dwi@stei.itb.ac.id; tmengko@itb.ac.id)

Abstract- The presence of inflammatory cells complicates the identification of nuclei in the early detection of cervical cancer. Inflammatory cells need to be eliminated to assist pathologists in reading Pap smear slides. In this paper, the Grey-Level Run-Length Matrix (GLRLM) texture of inflammatory cells and nuclei is investigated. Inflammatory cells and nuclei have different textures, and this difference can be used to distinguish them. To extract the features, the inflammatory cells and nuclei are first cropped manually. The extracted features are then analyzed and selected by a Decision Tree classifier (J48). Initially, eleven features are extracted in the direction of 135º to classify the cropped cells into inflammatory cells and nuclei. Based on the classification rule, the eleven features are then reduced to eight, namely low gray-level run emphasis, gray-level non-uniformity, run-length non-uniformity, long run low gray-level emphasis, short run high gray-level emphasis, short run low gray-level emphasis, long run high gray-level emphasis and run percentage. The experiment is applied to 957 cells from 50 images, composed of 122 nuclei and 837 inflammatory cells. The sensitivity obtained with these eight texture features shows that some nuclei are still classified as inflammatory cells, which is in accordance with the difficulties faced by pathologists, while the specificity suggests that most inflammatory cells are detected properly and only a minority are classified as nuclei.

Keywords: Classification, Pap smear cells, inflammatory cells, texture, GLRLM, Decision Tree.

I. INTRODUCTION

Global mortality from cervical cancer is still very high, and the disease is a threat to women's lives [1]. Early prevention of cervical cancer is possible with a Pap test, whose accuracy depends critically on the identification of the cell nucleus [2].

Pap smear cell research has focused on identifying the nucleus [3]-[10]. Mustafa et al. [11] use the features perimeter, red, green, blue, intensity1, intensity2 and saturation to classify cell nuclei into the classes normal, LSIL and HSIL. Plissiti et al. [10] use three kinds of features, namely texture, shape and intensity, to classify the cell nucleus. None of these studies considers the identification of inflammatory cells.

Many methods identify the location of the nucleus in each image. Garrido et al. [6] proposed a method for complex images that contain an excess of edge points or overlapping objects. In addition, genetic algorithms (Lassouaoui et al. [7]), pixel classification (Bak et al. [8]), region growing (Mat Isa [9]) and the combination of shape, texture and intensity features for cell nuclei extraction in Pap smear images (Plissiti et al. [10]) have been proposed. None of these methods deals with the presence of inflammatory cells in Pap smear images.

The presence of inflammatory cells in the background is a common feature of conventional Pap smear images. Many methods have been proposed in the literature for the analysis of Pap smear images, and several of them address the identification, extraction and classification of inflammatory cells and nuclei. Muhimmah et al. [12] proposed a method for the automated detection of cervical epithelial cell numbers in Pap smear images, which may contain overlapping nuclei and inflammatory cells. Riana et al. [13] proposed a method for the extraction of inflammatory cells that combines gray-level thresholding with distance-rule definitions.

In this paper we investigate texture features of inflammatory cells and nuclei. The proposed features are derived from the Grey-Level Run-Length Matrix (GLRLM), which yields 11 texture features. These eleven texture features are used to classify cells into two categories, inflammatory cells and nuclei, with the Decision Tree learning algorithm (J48).

This paper is divided into several sections. Section II describes the material and methods used in this study: the data sets, texture feature extraction, and the classification of inflammatory cells and nuclei. The implementation on the test data is in Section III. The results and discussion are in Section IV, and the conclusion is drawn in Section V.

978-1-4673-7800-0/15/$31.00 2015 IEEE



II. MATERIAL AND METHODS

This section briefly describes the experimental data sets, which are used as a case study to validate the proposed approach.

A. Data Set

In our experiments, we used 117 cytological images of conventional Pap smears: 67 images were selected to construct the training set and 50 images were used as test data. The training set, used for texture extraction and for establishing the classification rule, consists of 169 cells (83 nuclei and 86 inflammatory cells) taken from the 67 training images. The 50 test images were used to evaluate the performance of the implementation. These images contain 957 nucleus candidates, namely 122 nuclei and 837 inflammatory cells.

The method starts with manual cropping of nuclei and inflammatory cells from conventional Pap smear images that contain inflammatory cells. The cropping aims to obtain a separate image of each inflammatory cell and nucleus, as illustrated in Figure 1.

Fig. 1. The example of conventional Pap smear image with inflammatory cells and manual cropping (panel labels: inflammatory cell, nucleus).

B. Features Texture Extraction

Each manually cropped image of the 83 nuclei and 86 inflammatory cells (Fig. 1) is an RGB color image. At this stage all RGB images are converted to grayscale using the rgb2gray function. The grayscale value is obtained by forming a weighted sum of the R, G, and B components, as given in (1).

0.2989 R + 0.5870 G + 0.1140 B    (1)

The next process is texture analysis using the Grey-Level Run-Length Matrix (GLRLM). The GLRLM is a two-dimensional matrix in which each element p(i, j | θ) is the number of runs of j consecutive pixels with gray level i in the direction θ, and n_r is the total number of runs. Figure 2 represents the GLRL matrix for the 0º direction [p(i, j | θ = 0º)]. Besides 0º, the GLRL matrix can also be obtained in the direction of 45º, 90º, or 135º.

Fig. 2. Directions available for the GLRL matrix.

In Table I, i indexes the m gray levels and j indexes the run lengths up to the maximum run length n.

TABLE I
EQUATIONS TO CALCULATE THE GLRLM FEATURES

GLRLM feature                                       Equation
1. Short Run Emphasis (SRE)                         $\mathrm{SRE} = \frac{1}{n_r} \sum_{i=1}^{m} \sum_{j=1}^{n} \frac{p(i,j)}{j^{2}}$
2. Long Run Emphasis (LRE)                          $\mathrm{LRE} = \frac{1}{n_r} \sum_{i=1}^{m} \sum_{j=1}^{n} p(i,j)\, j^{2}$
3. Gray Level Non-uniformity (GLN)                  $\mathrm{GLN} = \frac{1}{n_r} \sum_{i=1}^{m} \Big( \sum_{j=1}^{n} p(i,j) \Big)^{2}$
4. Run Percentage (RP)                              $\mathrm{RP} = \frac{n_r}{\sum_{i=1}^{m} \sum_{j=1}^{n} p(i,j)\, j}$
5. Run Length Non-uniformity (RLN)                  $\mathrm{RLN} = \frac{1}{n_r} \sum_{j=1}^{n} \Big( \sum_{i=1}^{m} p(i,j) \Big)^{2}$
6. Low Gray Level Run Emphasis (LGRE)               $\mathrm{LGRE} = \frac{1}{n_r} \sum_{i=1}^{m} \sum_{j=1}^{n} \frac{p(i,j)}{i^{2}}$
7. High Gray Level Run Emphasis (HGRE)              $\mathrm{HGRE} = \frac{1}{n_r} \sum_{i=1}^{m} \sum_{j=1}^{n} p(i,j)\, i^{2}$
8. Short Run Low Gray-Level Emphasis (SRLGE)        $\mathrm{SRLGE} = \frac{1}{n_r} \sum_{i=1}^{m} \sum_{j=1}^{n} \frac{p(i,j)}{i^{2} j^{2}}$
9. Short Run High Gray-Level Emphasis (SRHGE)       $\mathrm{SRHGE} = \frac{1}{n_r} \sum_{i=1}^{m} \sum_{j=1}^{n} \frac{p(i,j)\, i^{2}}{j^{2}}$
10. Long Run Low Gray-Level Emphasis (LRLGE)        $\mathrm{LRLGE} = \frac{1}{n_r} \sum_{i=1}^{m} \sum_{j=1}^{n} \frac{p(i,j)\, j^{2}}{i^{2}}$
11. Long Run High Gray-Level Emphasis (LRHGE)       $\mathrm{LRHGE} = \frac{1}{n_r} \sum_{i=1}^{m} \sum_{j=1}^{n} p(i,j)\, i^{2} j^{2}$
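The grayscale conversion of (1) and the features of Table I can be illustrated with a short, self-contained Python sketch. It is not the MATLAB code used in this study: the function names (rgb_to_gray, quantize, glrlm_135, glrlm_features), the uniform binning into 8 gray levels, and the reading of the 135º direction as the top-left to bottom-right diagonal are assumptions made for illustration only.

```python
def rgb_to_gray(r, g, b):
    """Grayscale value as the weighted sum of Eq. (1)."""
    return 0.2989 * r + 0.5870 * g + 0.1140 * b


def quantize(gray, levels=8, max_value=256):
    """Map a gray value in [0, max_value) to an integer level 1..levels.
    Assumption: uniform binning; the paper does not state the binning rule."""
    return min(int(gray * levels / max_value), levels - 1) + 1


def glrlm_135(q, levels):
    """Run-length matrix p, where p[i-1][j-1] counts the runs of length j
    with gray level i. Runs are scanned along the top-left to bottom-right
    diagonals, taken here (as an assumption) to be the 135-degree direction."""
    rows, cols = len(q), len(q[0])
    p = [[0] * min(rows, cols) for _ in range(levels)]
    for k in range(-(rows - 1), cols):
        # Pixels of the diagonal whose column index is row index + k.
        line = [q[r][r + k] for r in range(rows) if 0 <= r + k < cols]
        start = 0
        for t in range(1, len(line) + 1):
            # A run ends at the end of the diagonal or when the level changes.
            if t == len(line) or line[t] != line[start]:
                p[line[start] - 1][t - start - 1] += 1
                start = t
    return p


def glrlm_features(p):
    """The eleven features of Table I for a run-length matrix p."""
    m, n = len(p), len(p[0])
    cells = [(i, j, p[i - 1][j - 1])
             for i in range(1, m + 1) for j in range(1, n + 1)]
    nr = sum(v for _, _, v in cells)  # total number of runs

    def mean(weight):
        return sum(v * weight(i, j) for i, j, v in cells) / nr

    return {
        "SRE": mean(lambda i, j: 1 / j**2),
        "LRE": mean(lambda i, j: j**2),
        "GLN": sum(sum(row) ** 2 for row in p) / nr,
        "RP": nr / sum(v * j for _, j, v in cells),
        "RLN": sum(sum(col) ** 2 for col in zip(*p)) / nr,
        "LGRE": mean(lambda i, j: 1 / i**2),
        "HGRE": mean(lambda i, j: i**2),
        "SRLGE": mean(lambda i, j: 1 / (i**2 * j**2)),
        "SRHGE": mean(lambda i, j: i**2 / j**2),
        "LRLGE": mean(lambda i, j: j**2 / i**2),
        "LRHGE": mean(lambda i, j: i**2 * j**2),
    }


# Toy 2x2 image, already quantized to gray levels 1..2.
toy = [[1, 1],
       [2, 1]]
toy_glrlm = glrlm_135(toy, levels=2)
toy_features = glrlm_features(toy_glrlm)
```

In the toy image the diagonal scan finds three runs (one run of length 2 and one of length 1 at level 1, and one run of length 1 at level 2) covering four pixels, so RP evaluates to 3/4 and LRE to (1 + 4 + 1)/3 = 2.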


The GLRLM features are computed for the inflammatory cells and nuclei. The complete equations of all statistical measures follow the texture analysis parameters of Xu et al. [14] and are shown in Table I.

C. Inflammatory cells and nuclei classification

The classification process uses the decision tree (J48) classifier. The feature values obtained from each matrix are used to classify cell nuclei and inflammatory cells. Figure 3 shows the result of the decision tree (J48) algorithm. The rule shows that only 8 texture features are used, namely LGRE, GLN, RLN, LRLGE, SRHGE, SRLGE, LRHGE, and RP. The classification was tried in the directions of 0º, 45º, 90º, and 135º, and the GLRL matrix in the direction of 135º gave the best result. With the Decision Tree (J48) in Weka, this result has 96.4497% Correctly Classified Instances (CCI), 3.5503% Incorrectly Classified Instances, and a Kappa statistic of 0.9289.

J48 pruned tree
-------------------
LGRE <= 0.273168
|   GLN <= 41.814242 : nucleus (47.0/1.0)
|   GLN > 41.814242
|   |   RLN <= 198.044898 : inflammatory (12.0/3.0)
|   |   RLN > 198.044898
|   |   |   GLN <= 92.526154 : nucleus (2.0)
|   |   |   GLN > 92.526154 : inflammatory (4.0/1.0)
LGRE > 0.273168
|   LRLGE <= 3.868303
|   |   SRHGE <= 64.026621
|   |   |   SRLGE <= 0.051582
|   |   |   |   LRHGE <= 8011.635671 : nucleus (6.0/1.0)
|   |   |   |   LRHGE > 8011.635671 : inflammatory (5.0)
|   |   |   SRLGE > 0.051582 : nucleus (5.0)
|   |   SRHGE > 64.026621 : inflammatory (8.0)
|   LRLGE > 3.868303
|   |   RP <= 1.1125
|   |   |   SRHGE <= 41.59111 : inflammatory (9.0)
|   |   |   SRHGE > 41.59111
|   |   |   |   SRHGE <= 52.54535 : nucleus (3.0)
|   |   |   |   SRHGE > 52.54535 : inflammatory (2.0)
|   |   RP > 1.1125 : inflammatory (48.0)

Number of Leaves : 12
Size of the tree : 23

Fig. 3. The rule texture of decision tree (J48) for inflammatory cells and nuclei.

III. IMPLEMENTATION OF DATA TEST

The first stage in the implementation on the test data is converting the RGB image to grayscale without changing the contrast of the original image. The converted image has low contrast, which complicates further processing. Each image has a different size and is in RGB color. At this stage all RGB images are converted to grayscale using the rgb2gray function.

In the second phase, image adjustment and filtering are used to enhance the contrast of the image, so that the cell boundaries are clearly distinguishable from the background. A global threshold of 0.07 is then used to get the image bounding boxes, and a bounding box larger than 50x50 pixels is taken as a nucleus candidate. Unsharp masking is used in the image sharpening step.

The next phase is creating the GLRLM of size 8x8 and calculating the 11 GLRLM properties (grayRLprops) for the direction of 135º. Once the texture features are obtained, each candidate is classified with the decision tree (J48) texture rule for inflammatory cells and nuclei (Fig. 3).

In the last phase, the color of every identified inflammatory cell is changed to a color similar to that of the cytoplasm, whereas the identified nuclei keep their original color. Table II shows the algorithm for feature extraction and classification of inflammatory cells and nuclei in normal Pap smear images.

TABLE II
ALGORITHM FOR FEATURE EXTRACTION AND CLASSIFICATION OF INFLAMMATORY CELLS AND NUCLEI IN NORMAL PAP SMEAR IMAGES

Pseudo-code
Input : Pap smear image with inflammatory cells and nuclei.
Output: Pap smear image with inflammatory cells eliminated and nuclei detected.

1. Convert the image from RGB to grayscale.
2. Enhance the image with adjustment and an unsharp filter.
3. Use the global threshold 0.07 to get the image bounding boxes.
4. If a bounding box > 50x50 pixels, then it is a nucleus candidate.
5. Create the GLRLM of size 8x8.
6. Calculate the 11 GLRLM properties (grayRLprops) for the direction of 135º.
7. Use the decision tree (J48) texture rule to classify inflammatory cells and nuclei.
8. Keep the nuclei and eliminate the inflammatory cells by changing their color into a color that is similar to the color of the cytoplasm.

IV. RESULTS AND DISCUSSION

Numerical evaluation of the texture analysis was carried out on 86 inflammatory cells and 83 nuclei. In recognizing inflammatory cells and cell nuclei, 8 GLRLM features were found to distinguish between them. These features are LGRE, GLN, RLN, LRLGE, SRHGE, SRLGE, LRHGE, and RP, while the other three GLRLM features, namely SRE, LRE and HGRE, were not used.

Regarding the unused features, the values of SRE and LRE, which depend on the short-run and long-run structure of coarse texture, do not distinguish inflammatory cells from nuclei. The mean SRE of both groups is 0.18. The mean and standard deviation of SRE for the nucleus group are 0.18 ± 0.07, almost equal to the 0.18 ± 0.05 of the inflammatory cells.
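To make the rule of Fig. 3 concrete, it can be written as nested conditionals. The sketch below is only an illustration under stated assumptions: the function name classify_cell and the dictionary input are hypothetical, the feature keys are the eight abbreviations listed in Section II-C, and the nesting of the branches follows one reading of the printed tree that is consistent with its 12 leaves and 23 nodes. As a quick consistency check, the rule is applied to the mean feature values reported in Tables III and IV.

```python
def classify_cell(f):
    """Apply the J48 rule of Fig. 3 to a dict of GLRLM features (135 degrees).
    Assumption: branch nesting inferred from the printed rule
    (12 leaves, 23 nodes); thresholds are those printed in Fig. 3."""
    if f["LGRE"] <= 0.273168:
        if f["GLN"] <= 41.814242:
            return "nucleus"
        if f["RLN"] <= 198.044898:
            return "inflammatory"
        return "nucleus" if f["GLN"] <= 92.526154 else "inflammatory"
    if f["LRLGE"] <= 3.868303:
        if f["SRHGE"] > 64.026621:
            return "inflammatory"
        if f["SRLGE"] > 0.051582:
            return "nucleus"
        return "nucleus" if f["LRHGE"] <= 8011.635671 else "inflammatory"
    if f["RP"] > 1.1125:
        return "inflammatory"
    if f["SRHGE"] <= 41.59111:
        return "inflammatory"
    return "nucleus" if f["SRHGE"] <= 52.54535 else "inflammatory"


# Mean feature values of the training nuclei (Table III).
nuclei_mean = {"LGRE": 0.2, "GLN": 46.27, "RLN": 215.64, "LRLGE": 2.5,
               "SRHGE": 68.78, "SRLGE": 0.031, "LRHGE": 10532.77, "RP": 0.99}
# Mean feature values of the training inflammatory cells (Table IV).
inflammatory_mean = {"LGRE": 0.35, "GLN": 96.32, "RLN": 175.15, "LRLGE": 4.45,
                     "SRHGE": 40.42, "SRLGE": 0.05, "LRHGE": 8007.21, "RP": 1.29}

print(classify_cell(nuclei_mean))        # nucleus
print(classify_cell(inflammatory_mean))  # inflammatory
```

Under this reading, the group means fall on the expected side of the rule: the nuclei means follow the low-LGRE branch and the inflammatory means reach the RP > 1.1125 leaf. This says nothing about individual cells, whose values spread widely around the means.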


TABLE III
THE TEXTURE OF NUCLEI

GLRLM Features   Mean       St Dev    Range      Min       Max
SRE              0.18       0.07      0.3        0.05      0.35
LRE              31.34      4.419     25.71      21.33     47.04
GLN              46.27      23.58     105.7      11.31     117
RLN              215.64     74.08     320.83     105.61    426.45
RP               0.99       0.23      1.21       0.55      1.76
LGRE             0.2        0.08      0.37       0.03      0.4
HGRE             291.71     172.03    812.5      61.593    874.09
SRLGE            0.031      0.017     0.083      9E-04     0.084
SRHGE            68.78      63.18     425.6      6.978     432.6
LRLGE            2.5        0.94      3.82       0.83      4.65
LRHGE            10532.77   5523.99   23898.99   1866.51   25765.5

TABLE IV
THE TEXTURE OF INFLAMMATORY CELLS

GLRLM Features   Mean       St Dev    Range      Min       Max
SRE              0.18       0.05      0.22       0.06      0.28
LRE              27.55      2.756     14.61      21.59     36.2
GLN              96.32      40.71     239.1      21.84     260.9
RLN              175.15     63.639    371.17     54.216    425.39
RP               1.29       0.26      1.17       0.76      1.93
LGRE             0.35       0.08      0.35       0.16      0.5
HGRE             173.98     98.14     482.95     27.418    510.37
SRLGE            0.05       0.02      0.07       0.02      0.09
SRHGE            40.42      30.11     125.6      3.306     128.9
LRLGE            4.45       1.4       5.83       1.66      7.49
LRHGE            8007.21    5573.54   27095.52   795.95    27891.47

There is a noticeable difference between the cell nuclei and the inflammatory cells in the LRHGE feature (Long Run High Gray-Level Emphasis), a GLRLM feature that measures the joint distribution of long runs and high gray-level values: its mean is 10532.77 for nuclei and 8007.21 for inflammatory cells. The overall distribution of the GLRLM texture values for the groups of nuclei and inflammatory cells can be seen in Tables III and IV.

To compare the eleven texture features of nuclei and inflammatory cells, the means of the two groups are plotted in Figure 4. The eleven GLRLM values of inflammatory cells and nuclei are of similar magnitude, and the lines of the two data sets almost coincide.

Fig. 4. Mean of Texture Nucleus and Inflammatory Cells

The dataset used in the classification phase is different from the one used for feature extraction and for constructing the classification rule. This is done to test the robustness of the features on new data. We tested the classification stage on 50 images. These images contain 957 nucleus candidates, namely 122 nuclei and 837 inflammatory cells. From the counts of True Positive (TP), True Negative (TN), False Positive (FP), and False Negative (FN), two statistical measures of performance were calculated. Sensitivity is the proportion of nuclei that were detected correctly and is defined as:

Sensitivity = TP / (TP + FN) x 100%    (2)

Specificity is the proportion of inflammatory cells that were detected correctly and is defined as:

Specificity = TN / (TN + FP) x 100%    (3)

The specificity of 75.04% indicates that most inflammatory cells are detected properly, with the remaining 24.96% classified as nuclei, while the sensitivity of 49.82% indicates that about half of the nuclei are still classified as inflammatory cells. This is in accordance with the difficulties faced by pathologists.

Fig. 5. (a) Initial image, (b) The resulting image with multiple segmented inflammatory cells

Figure 5(a) shows several initial images of normal Pap smears, and Figure 5(b) shows the nuclei with their cell centroids. The inflammatory cells are extracted and eliminated by changing their color into a color that is similar to the color of the cytoplasm, while the detected cell nuclei keep their original color. Figure 5(b) shows the result of this procedure on an original image; the inflammatory cells in this image are suppressed.

V. CONCLUSION

The identification of nuclei in Pap smear images is a challenging issue, mainly because these images contain inflammatory cells and have other limitations. In this paper, we propose a GLRLM texture classification rule, obtained with a Decision Tree (J48) algorithm, that can be used to separate nuclei from inflammatory cells in normal Pap smear images. The specificity indicates that the proposed method can be used in future studies to classify inflammatory cells, because most inflammatory cells are detected. Further research will seek a classification rule for nuclei and inflammatory cells obtained from automatic cropping, in order to obtain better results.

ACKNOWLEDGMENT

Dwiza Riana would like to thank Laboratorium Khusus Patologi Veteran, Bandung, Indonesia for the database of Pap smear images.

REFERENCES

[1] A. Jemal, et al., “Global Cancer Statistics,” CA Cancer J Clin, vol. 61, no. 2, pp. 69–90, 2011.
[2] A. Kale and S. Aksoy, “Segmentation of Cervical Cell Images,” International Conference on Pattern Recognition, IEEE, 2010.
[3] P. Bamford and B. Lovell, “A water immersion algorithm for cytological image segmentation,” in Proc. APRS Image Segmentation Workshop, Sydney, Australia, pp. 75–79, 1996.
[4] P. Bamford and B. Lovell, “Unsupervised cell nucleus segmentation with active contours,” Signal Process., vol. 71, no. 2, pp. 203–213, 1998.
[5] H. S. Wu, J. Barba, and J. Gil, “A parametric fitting algorithm for segmentation of cell images,” IEEE Trans. Biomed. Eng., vol. 45, no. 3, pp. 400–407, Mar. 1998.
[6] A. Garrido and N. P. de la Blanca, “Applying deformable templates for cell image segmentation,” Pattern Recognition, vol. 33, no. 5, pp. 821–832, 2000.
[7] N. Lassouaoui and L. Hamami, “Genetic algorithms and multifractal segmentation of cervical cell images,” in Proc. 7th Int. Symp. Signal Process. Appl., vol. 2, pp. 1–4, 2003.
[8] E. Bak, K. Najarian, and J. P. Brockway, “Efficient segmentation framework of cell images in noise environments,” in Proc. 26th Int. Conf. IEEE Eng. Med. Biol., vol. 1, pp. 1802–1805, September 2004.
[9] N. A. M. Isa, “Automated edge detection technique for Pap smear images using moving K-means clustering and modified seed based region growing algorithm,” Int. J. Comput. Internet Manag., vol. 13, no. 3, pp. 45–59, 2005.
[10] M. E. Plissiti, C. Nikou, and A. Charchanti, “Combining Shape, Texture and Intensity Features for Cell Nuclei Extraction in Pap Smear Images,” Pattern Recognition Letters, vol. 32, no. 6, pp. 838–853, 2011.
[11] N. Mustafa, M. N. A. Isa, and M. Y. Mashor, “Automated Multicells Segmentation of Thinprep® Image Using Modified Seed Based Region Growing Algorithm,” Biomed Soft Comput Hum Sci., vol. 14, no. 2, pp. 41–47, 2009.
[12] I. Muhimmah, R. Kurniawan, and Indrayanti, “Analysis of Features to Distinguish Epithelial Cells and Inflammatory Cells in Pap Smear Images,” 6th International Conference on Biomedical Engineering and Informatics (BMEI 2013), pp. 122–127, 2013.
[13] D. Riana, M. E. Plissiti, C. Nikou, D. H. Widyantoro, T. L. R. Mengko, and O. Kalsoem, “Inflammatory Cell Extraction and Nuclei Detection in Pap Smear Images,” International Journal of E-Health and Medical Communications (IJEHMC), ISSN: 1947-315X, 1947-3168, vol. 6, no. 2, pp. 27–43, April–June 2015.
[14] D. Xu, A. S. Kurani, J. D. Furst, and D. S. Raicu, “Run-length Encoding for Volumetric Texture,” The 4th IASTED International Conference on Visualization, Imaging, and Image Processing, 2004.
