Professional Documents
Culture Documents
NITHEESHA.S (1JB14LVS07)
Fig 2: COS algorithm comprises two steps: blockwise segmentation and global
segmentation. The parameters of the cost function used in the global segmentation
are optimized in an offline training procedure.
Fig 3. Illustration of a blockwise segmentation. The pixels in each block are separated
into foreground (1) or background (0) by comparing each pixel with a threshold t.
The threshold t is then selected to minimize the total subclass variance.
component inversion,
component classification
A. Preprocessing
For consistency all scanner outputs were converted to sRGB
color coordinates [49] and descreened before segmentation . The scanned
RGB values were first converted to an intermediate device-independent color
space, CIE XYZ, then transformed to sRGB .
Fig. e. Text regions in the binary mask. The region is 165 370 pixels at 400 dpi, which
corresponds to 1.04 cm 2.34 cm. (a) Original test image. (b) Ground truth
segmentation. (c) Otsu/CCC. (d) Multiscale-COS/CCC/Zheng. (e) DjVu. (f) LuraDocument.
(g) COS. (h) COS/CCC. (i) Multiscale-COS/CCC.
[1] ITU-T Recommendation T.44 Mixed Raster Content (MRC), T.44, International
Telecommunication Union, 1999.
[2] G. Nagy, S. Seth, and M. Viswanathan, A prototype document image
analysis system for technical journals, Computer, vol. 25, no. 7, pp.
1022, 1992.
[3] K. Y.Wong and F. M.Wahl, Document analysis system, IBM J. Res.
Develop., vol. 26, pp. 647656, 1982.
[4] J. Fisher, A rule-based system for document image segmentation, in
Proc. 10th Int. Conf. Pattern Recognit., 1990, pp. 567572.
[5] L. OGorman, The document spectrum for page layout analysis,
IEEE Trans. Pattern Anal. Mach. Intell., vol. 15, no. 11, pp.
11621173, Nov. 1993.