
International Journal of Pattern Recognition and Artificial Intelligence


Vol. 28, No. 1 (2014) 1450001 (40 pages)
© World Scientific Publishing Company
DOI: 10.1142/S0218001414500013

Int. J. Patt. Recogn. Artif. Intell. 2014.28. Downloaded from www.worldscientific.com


by 122.176.242.35 on 07/12/15. For personal use only.

GRAPH MATCHING AND LEARNING IN PATTERN RECOGNITION IN THE LAST 10 YEARS

PASQUALE FOGGIA*, GENNARO PERCANNELLA and MARIO VENTO


Department of Information Engineering
Electrical Engineering and Applied Mathematics
University of Salerno
Via Giovanni Paolo II, 132, 84084 Fisciano (SA), Italy
*pfoggia@unisa.it

pergen@unisa.it

mvento@unisa.it
Received 25 June 2013
Accepted 14 October 2013
Published 9 December 2013
In this paper, we examine the main advances registered in the last ten years in Pattern
Recognition methodologies based on graph matching and related techniques, analyzing more
than 180 papers; the aim is to provide a systematic framework presenting the recent history and
the current developments. This is done by introducing a categorization of graph-based techniques and reporting, for each class, the main contributions and the most outstanding research
results.
Keywords: Structural pattern recognition; graph matching; graph kernels; graph embeddings;
graph learning; graph clustering; graph and tree search strategies.

1. Introduction
Structural Pattern Recognition bases its theoretical foundations on the decomposition of objects in terms of their constituent parts (subpatterns) and of the relations
among them. Graphs, usually enriched with node and edge attributes, are the data structures of choice for supporting this kind of representation. Some of the methods
working on graphs introduce restrictions on the structure of the graphs (e.g.
only allowing planar graphs) or on the kind of attributes (e.g. some methods only
allow single real-valued attributes for the graph edges).
The use of a graph-based pattern representation induces the need to formulate
the main activities required for Pattern Recognition in terms of operations on
graphs: classification, usually intended as the comparison between an object and a
set of prototypes, and learning, which is the process for obtaining a model of a class
starting from a set of known samples, are among the key issues that must be
addressed using graph-based techniques.
*Corresponding author.
The use of graphs in Pattern Recognition dates back to the early seventies, and
the paper “Thirty years of graph matching in Pattern Recognition” [26] reports a
survey of the literature on graph-based techniques from the first years up to the
early 2000s. We have surely witnessed a maturation of the classical techniques for
graph comparison, either exact or inexact; at the same time we are witnessing a
rapid growth of many alternative approaches, such as graph embedding and graph
kernels, aimed at making it possible to apply to graphs the vector-based techniques for classification and learning (such as the ones derived from statistical
classification and learning theory).
In this paper, we discuss the main advances registered in graph-based methodologies in the last 10 years, analyzing more than 180 papers on this topic; the aim is
to provide a systematic framework presenting the recent history of graphs in Pattern
Recognition and the current trends.
Our analysis starts from the above-mentioned survey [26] and completes its contents
by considering a selection of the most recent main contributions; consequently, the
present paper, for the sake of conciseness, reports only references to works published
during the last 10 years. The reader is kindly invited to consult Ref. 26 to find
the previous related works. However, the taxonomy of the papers presented in
Ref. 26 has been extended with other graph-based problems that are related to
matching, either because they involve some form of graph comparison, or because
they use a graph-based approach to group patterns into classes. Figure 1 shows a
graphical representation of the taxonomy adopted in this paper.
In fact, in the last decade we have witnessed the birth and growth of methods
facing learning and classification with a rather innovative scientific vision: the
computational burden of matching algorithms together with their intrinsic complexity, in opposition to the well-established world of statistical Pattern Recognition
methodologies, suggested new paradigms for the graph-based methods. Why not
try to reduce graph matching and learning to vector-based operations, so as to
make possible the use of statistical approaches?
There are two opposite ways of facing the problem, each with its pros and cons: “graphs
from the beginning to the end”, with a few heavy algorithms, but the exploitation of
all the information contained in the graphs; on the other side, the risk of losing
discriminating power during the conversion of graphs into vectors (by selecting
suitable properties), counterbalanced by the immediate access to all the theoretically
assessed achievements of the statistical framework. In a sense, there are some traditional tools that can be considered to be halfway between these two approaches: an
example is Graph Edit Distance (GED), which is based on a matching between the
nodes and the edges of the two graphs, but produces a distance information that can
be used to cast the graphs into a metric space. However, GED can still be considered
an approach of the first kind, since in the computation of the metric, the information
attached to the subparts can be considered in a context-dependent way, and does not have
to be reduced a priori to a vectorial form.
These two opposite factions are now simultaneously active, each hoping to
overcome the other; 10 years ago these innovative methods were in the background,
but now they are gaining more and more attention in the scientific literature on
graphs. This is the reason why the categorization reported in this paper has
been further expanded by including a new section describing a variety of novel
approaches, such as graph embedding, graph kernels, graph clustering and graph
learning, dedicating a subsection to each of them. It is worth pointing out that these
methods were of course already known at the time of Ref. 26, but their diffusion and
scientific interest have shown a significant growth in the last decade. For instance, a
recent survey by Hancock and Wilson [71] compares and contrasts the work on graph-based techniques by the Bern group led by Horst Bunke and the York group led by
Edwin Hancock. The first group has historically put more emphasis on the purely
structural aspects of graph-based techniques, while the second has focused on the
extensions to the graph domain of probabilistic and information theoretic methodologies; however, both schools in the last decade have found a point of convergence in the adoption of graph kernels and graph embedding techniques. Another
recent paper by Livi and Rizzi [98] presents a survey of graph matching techniques.
However, despite its title, it is mostly dedicated to graph embeddings and graph
kernels, and does not aim to cover comprehensively the graph matching techniques;
furthermore the paper is less specifically devoted to approaches used within the
Pattern Recognition community.
The overall organization of our paper is based on a categorization of the
approaches with respect to the problem formulation they adopt, and secondarily to
the kind of technique used to face the problem, following the taxonomy reported in
Fig. 1. We have distinguished between graph matching problems, which will be presented in Sec. 2, and other problems related to graph comparison, which are discussed
in Sec. 3. In particular, the section on graph matching is divided between exact and
inexact matching techniques. The section on other problems is articulated according
to the techniques that have obtained most attention in recent literature, namely
graph embedding, graph kernels, graph clustering and graph learning, with a miscellaneous problems subsection for less common but related problems.
For reasons of space, in this survey we have focused on the algorithms and not on
their applications. The interested reader may find some complementary surveys on
the applications of graph matching to Computer Vision and Pattern Recognition in
Refs. 28 and 53. For the very same reasons, we have not included research papers
from outside of the Pattern Recognition community. Graph-based methods are used
and investigated in many other research fields; among them, we can mention, with no
pretense at completeness, Data Mining, Machine Learning, Complex Networks
Analysis and Bioinformatics.

Fig. 1. A graphical representation of the adopted categorization of the considered graph-based techniques. The techniques in the figure have been chosen because they either involve some kind of graph
comparison, or use a graph-based approach to group objects into classes.

2. Graph Matching
We briefly recall the terminology used in our previous survey [26]. Exact graph
matching is the search for a mapping between the nodes of two graphs which is edge-preserving, in the sense that if two nodes in the first graph are linked by an edge, the
corresponding nodes in the second graph must have an edge, too. Several variants of
exact matching exist (e.g. isomorphism, subgraph isomorphism, monomorphism,
homomorphism, maximum common subgraph) depending on whether this constraint
must hold in both directions of the mapping or not, whether the mapping must be injective
and whether the mapping must be surjective.
More formally, given two graphs $G_1 = (V_1, E_1)$ and $G_2 = (V_2, E_2)$ (where $V$ and $E$
are the sets of nodes and edges, respectively), a mapping is a function $\mu : V_1 \to V_2$. A
mapping $\mu$ is edge preserving iff:

\[ \forall v, w \in V_1, \quad (v, w) \in E_1 \Rightarrow (\mu(v), \mu(w)) \in E_2 \;\vee\; \mu(v) = \mu(w). \tag{1} \]

An edge preserving mapping is also called a homomorphism. A monomorphism, also
called an edge-induced subgraph isomorphism, is a homomorphism $\mu$ that is also
injective:

\[ \forall v \neq w \in V_1, \quad \mu(v) \neq \mu(w). \tag{2} \]

A graph isomorphism is a monomorphism $\mu$ that is bijective, and whose inverse
mapping $\mu^{-1}$ is also a monomorphism:

\[ \begin{cases} \forall v_2 \in V_2,\ \exists v_1 = \mu^{-1}(v_2) \in V_1 : v_2 = \mu(v_1) \\ \mu^{-1} \text{ is a monomorphism} \end{cases} \tag{3} \]

A mapping $\mu$ is a subgraph isomorphism, that some authors call a node-induced
subgraph isomorphism, if there is a (node-induced) subgraph $G_2'$ of $G_2$ such that $\mu$
is an isomorphism between $G_1$ and $G_2'$. More formally:

\[ \begin{cases} V_2' = \{ v_2 \in V_2 : \exists v_1 \in V_1 : v_2 = \mu(v_1) \} \subseteq V_2 \\ E_2' = E_2 \cap (V_2' \times V_2') \subseteq E_2 \\ \mu \text{ is an isomorphism between } G_1 \text{ and } G_2' = (V_2', E_2'). \end{cases} \tag{4} \]
Finally, the maximum common subgraph problem is the search for the largest subgraph of G1 that is isomorphic to a subgraph of G2 (and usually, for the corresponding
mapping between the two subgraphs).
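As a minimal sketch of the definitions above, the following Python functions check whether a given mapping is a homomorphism, a monomorphism, an isomorphism or a (node-induced) subgraph isomorphism. The representation is an assumption made here for illustration only: a graph is a set of directed node pairs, a mapping is a dictionary, and all function names are hypothetical. An undirected graph can be encoded by storing each edge in both directions.

```python
def is_homomorphism(mu, edges1, edges2):
    """Edge-preserving check: every edge (v, w) of G1 must map onto an edge
    of G2, unless both endpoints are mapped onto the same node."""
    return all((mu[v], mu[w]) in edges2 or mu[v] == mu[w] for v, w in edges1)


def is_monomorphism(mu, edges1, edges2):
    """A homomorphism that is also injective."""
    injective = len(set(mu.values())) == len(mu)
    return injective and is_homomorphism(mu, edges1, edges2)


def is_isomorphism(mu, nodes2, edges1, edges2):
    """A bijective monomorphism whose inverse is a monomorphism as well."""
    if set(mu.values()) != set(nodes2):
        return False                      # not surjective
    if not is_monomorphism(mu, edges1, edges2):
        return False
    inverse = {w: v for v, w in mu.items()}
    return is_monomorphism(inverse, edges2, edges1)


def is_subgraph_isomorphism(mu, edges1, edges2):
    """Isomorphism between G1 and the subgraph of G2 induced by the image of mu."""
    image = set(mu.values())
    induced = {(a, b) for a, b in edges2 if a in image and b in image}
    return is_isomorphism(mu, image, edges1, induced)


# A directed path a -> b -> c mapped into a graph that also has the edge 1 -> 3.
path = {("a", "b"), ("b", "c")}
target = {(1, 2), (2, 3), (1, 3)}
mu = {"a": 1, "b": 2, "c": 3}
print(is_monomorphism(mu, path, target))          # True
print(is_subgraph_isomorphism(mu, path, target))  # False: edge (1, 3) has no counterpart
```

The example shows the difference between the two notions: the extra edge of the second graph is tolerated by a monomorphism, but not by a node-induced subgraph isomorphism.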
In inexact graph matching, instead, the constraints on edge preservation are
relaxed, either because the algorithms attempt to deal with errors in the input graphs
(and so we have error-correcting matching) or because, for reducing the computational cost, they search the mapping with a strategy that does not ensure the optimality of the found solutions (approximate or suboptimal matching).
For inexact matching, there is not a single formal statement of the problem;
instead, different papers often use slightly different formalizations, which may lead to
different ways of relaxing the edge preservation constraints. With no pretense at
completeness, in the following we will describe two formalizations that have been
used by several works.
In the first definition, the concept of a mapping function $\mu$ is extended so as to
include the possibility of mapping a node $v$ to a special, null node denoted as $\varepsilon$; thus
the mapping is a function $\mu : V_1 \to V_2 \cup \{\varepsilon\}$. We will assume that $\mu$ is injective for the
nodes of $V_1$ not mapped to $\varepsilon$,

\[ \forall v \neq w \in V_1, \quad \mu(v) \neq \varepsilon \Rightarrow \mu(v) \neq \mu(w) \tag{5} \]

while allowing that several nodes may be mapped to $\varepsilon$. With a slightly improper
notation, we will say that $\mu^{-1}(w) = \varepsilon$ to indicate that there is no node $v \in V_1$ such
that $\mu(v) = w$.
Then, the cost of a mapping $\mu$ is defined as:

\[
\begin{aligned}
C(\mu) ={} & \sum_{\substack{v \in V_1 \\ \mu(v) \neq \varepsilon}} C_R(v, \mu(v))
 + \sum_{\substack{v \in V_1 \\ \mu(v) = \varepsilon}} C_D(v)
 + \sum_{\substack{w \in V_2 \\ \mu^{-1}(w) = \varepsilon}} C_D(w) \\
 & + \sum_{\substack{(v, w) \in E_1 \\ (\mu(v), \mu(w)) \in E_2}} C'_R((v, w), (\mu(v), \mu(w)))
 + \sum_{\substack{(v, w) \in E_1 \\ (\mu(v), \mu(w)) \notin E_2}} C'_D(v, w) \\
 & + \sum_{\substack{(v, w) \in E_2 \\ (\mu^{-1}(v), \mu^{-1}(w)) \notin E_1}} C'_D(v, w),
\end{aligned}
\tag{6}
\]

where $C_R(\cdot, \cdot)$ is the cost for the replacement of a node, $C_D(\cdot)$ is the cost for the
deletion of a node, and $C'_R$ and $C'_D$ are the replacement and deletion costs for edges.
These cost functions are to be defined according to the application requirements, and
usually take into account additional, application-dependent attributes that are attached to nodes and edges.
In this formulation, the matching problem is cast as the search for the matching
that minimizes the cost $C$. With an appropriate choice of the cost functions, it
can be demonstrated that the exact matching problems defined previously can be
seen as special cases of this one, with the additional requirement that the matching
cost must be 0.
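The cost of a given mapping can be evaluated term by term. The sketch below is only an illustration under stated assumptions: graphs are sets of directed node pairs, `None` plays the role of the null node, and the function name `mapping_cost` and the four cost callables are hypothetical names introduced here, to be supplied by the application.

```python
def mapping_cost(mu, nodes1, nodes2, edges1, edges2,
                 node_sub, node_del, edge_sub, edge_del):
    """Cost of a mapping mu: V1 -> V2 or None (None stands for the null node).
    mu is assumed to be injective on the nodes not mapped to None."""
    inverse = {w: v for v, w in mu.items() if w is not None}
    cost = 0.0
    # Node terms for G1: replaced nodes and deleted nodes.
    for v in nodes1:
        cost += node_sub(v, mu[v]) if mu[v] is not None else node_del(v)
    # Node term for G2: nodes that are the image of no node of G1.
    cost += sum(node_del(w) for w in nodes2 if w not in inverse)
    # Edge terms for G1: preserved edges are replaced, the others are deleted.
    for v, w in edges1:
        image = (mu[v], mu[w])
        cost += edge_sub((v, w), image) if image in edges2 else edge_del((v, w))
    # Edge term for G2: edges with no counterpart in G1.
    for v, w in edges2:
        if (inverse.get(v), inverse.get(w)) not in edges1:
            cost += edge_del((v, w))
    return cost


# Unit deletion costs and free replacements (hypothetical choice of costs).
cost = mapping_cost(
    mu={"a": 1, "b": 2, "c": None},
    nodes1=["a", "b", "c"], nodes2=[1, 2],
    edges1={("a", "b"), ("b", "c")}, edges2={(1, 2)},
    node_sub=lambda v, w: 0.0, node_del=lambda v: 1.0,
    edge_sub=lambda e, f: 0.0, edge_del=lambda e: 1.0,
)
print(cost)  # 2.0: node "c" and edge ("b", "c") are deleted
```

With these costs an isomorphism between the two graphs would yield a cost of 0, in line with the remark that exact matching is a special case of this formulation.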
In the second formulation, called weighted graph matching, the graphs are
represented through their adjacency matrices; usually the elements of the matrices
are not restricted to 0 and 1, but can express a continuous weight for the relation
between two nodes: so the generic element $A_{ij}$ of the matrix $A$ is 0 if there is not an
edge between nodes $i$ and $j$, and has otherwise a real value in $(0, 1]$ denoting the
weight for the edge $(i, j)$.
Given two graphs represented by their adjacency matrices $A$ and $B$, a compatibility tensor $C_{ijkl}$ is introduced to measure the compatibility between two
edges:

\[ C_{ijkl} = \begin{cases} 0 & \text{if } A_{ij} = 0 \text{ or } B_{kl} = 0, \\ c(A_{ij}, B_{kl}) & \text{otherwise} \end{cases} \tag{7} \]

where $c(\cdot, \cdot)$ is a suitably defined compatibility function. The matching is represented by a matching matrix $M$, whose elements $M_{ik}$ are 1 if node $i$ of the first
graph is matched with node $k$ of the second graph, 0 otherwise. Thus the matching
problem is formulated as the search for the matrix $M$ that maximizes the following
function:

\[ W(M) = \sum_i \sum_j \sum_k \sum_l M_{ik} \cdot M_{jl} \cdot C_{ijkl} \tag{8} \]

subject to the constraints:

\[ M_{ik} \in \{0, 1\}; \qquad \forall i,\ \sum_k M_{ik} \le 1; \qquad \forall k,\ \sum_i M_{ik} \le 1. \tag{9} \]

Also with this formulation, it can be demonstrated that with a suitable choice of
the compatibility function $c(\cdot, \cdot)$, the various forms of exact matching can be seen
as a special case.
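To make the weighted graph matching objective concrete, the following sketch evaluates W(M) and maximizes it by exhaustive enumeration, which is feasible only for very small graphs. Several choices are assumptions made for illustration: the compatibility function (one minus the absolute weight difference), the encoding of M as a tuple `match` with `match[i] = k`, and the restriction to matchings in which every node of the first graph is matched (which requires the second graph to have at least as many nodes).

```python
from itertools import permutations


def compatibility(a, b):
    """Hypothetical compatibility function: 1 for equal weights,
    decreasing linearly with the weight difference."""
    return 1.0 - abs(a - b)


def score(match, A, B):
    """W(M) for a matching encoded as match[i] = k, meaning M_ik = 1."""
    total = 0.0
    for i, k in enumerate(match):
        for j, l in enumerate(match):
            if A[i][j] != 0 and B[k][l] != 0:   # C_ijkl is 0 otherwise
                total += compatibility(A[i][j], B[k][l])
    return total


def best_matching(A, B):
    """Exhaustive maximization over all one-to-one matchings of the nodes of A."""
    return max(permutations(range(len(B)), len(A)),
               key=lambda match: score(match, A, B))


# A weighted path 0 - 1 - 2 and a relabeled copy (node i of A is node p[i] of B,
# with p = [2, 0, 1]).
A = [[0.0, 0.5, 0.0],
     [0.5, 0.0, 1.0],
     [0.0, 1.0, 0.0]]
B = [[0.0, 1.0, 0.5],
     [1.0, 0.0, 0.0],
     [0.5, 0.0, 0.0]]
print(best_matching(A, B))  # (2, 0, 1)
```

The enumeration grows factorially with the number of nodes, which is the reason why the methods surveyed in the following sections resort to relaxations or heuristics.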
While in the years covered by Ref. 26 the research has explored both exact and
inexact matching, the recent work on graphs in the Pattern Recognition community
has been mostly focused on inexact graph matching. This may be due to the fact that
today the Pattern Recognition research is applying graphs to more complex problems than those that were feasible some years ago, and so the use
of larger and noisier graphs is more frequent.

2.1. Exact matching


While overall there has been little work on improving existing exact matching
algorithms, some effort has been put into providing a better characterization of the
existing methods. As an example, the 2003 paper by De Santo et al. [138] presents an
extensive comparative evaluation of four exact algorithms for graph isomorphism
and graph-subgraph isomorphism.
Most existing exact matching algorithms are based on some form of tree search,
where the matching is constructed starting with an empty mapping function and
adding a pair of nodes at a time, usually with the possibility of backtracking, and the
use of heuristics to avoid the complete exploration of the space of all the possible
matchings. In 2007, Konc and Janežič [87] propose MaxCliqueDyn, an improved algorithm for finding the Maximum Clique (and hence the Maximum Common Subgraph) which uses branch and bound, combined with approximate graph coloring for
finding tight bounds in order to prune the search space. In a 2011 paper, Ullmann [160]
presents a substantial improvement of his own very well-known subgraph isomorphism algorithm from 1976. The new algorithm incorporates several ideas from the
literature on the Binary Constraint Satisfaction Problem, of which the subgraph
isomorphism can be considered a special case. Also Zampelli et al. [179] propose a
method based on Constraint Satisfaction, which is an extension of the technique
introduced by Larrosa and Valiente in 2002 [90]. A further development of the technique, with the introduction of a better filtering based on the AllDifferent constraint,
is proposed by Solnon [152] in 2010.
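The general tree search scheme described above (start from an empty mapping, add one pair of nodes at a time, backtrack on failure) can be sketched as follows. This is a plain depth-first search for a monomorphism, written for illustration only; it is not the algorithm of any of the cited papers, it uses no pruning heuristics beyond the consistency check, and it ignores self-loops. The function names and the edge-set representation are assumptions.

```python
def undirected(pairs):
    """Store each undirected edge in both directions."""
    return {(a, b) for a, b in pairs} | {(b, a) for a, b in pairs}


def find_monomorphism(nodes1, edges1, nodes2, edges2):
    """Depth-first tree search with backtracking for an injective,
    edge-preserving mapping from G1 into G2. Returns a dict or None."""
    order = list(nodes1)

    def consistent(v, w, mapping):
        # Every edge between v and an already mapped node must be preserved.
        for u, x in mapping.items():
            if (v, u) in edges1 and (w, x) not in edges2:
                return False
            if (u, v) in edges1 and (x, w) not in edges2:
                return False
        return True

    def extend(mapping):
        if len(mapping) == len(order):
            return dict(mapping)              # complete matching found
        v = order[len(mapping)]               # next node of G1 to be matched
        used = set(mapping.values())
        for w in nodes2:
            if w not in used and consistent(v, w, mapping):
                mapping[v] = w                # add the pair (v, w)
                result = extend(mapping)
                if result is not None:
                    return result
                del mapping[v]                # backtrack
        return None

    return extend({})


triangle = undirected([("a", "b"), ("b", "c"), ("a", "c")])
square_with_diagonal = undirected([(1, 2), (2, 3), (3, 4), (4, 1), (1, 3)])
print(find_monomorphism(["a", "b", "c"], triangle,
                        [1, 2, 3, 4], square_with_diagonal))
```

The consistency check is what prunes the search tree: a partial mapping that already violates edge preservation is never extended.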
Among the approaches not based on tree search, we can mention Gori et al. [64],
who, in their 2005 paper, propose an isomorphism algorithm based on Random Walks, which works only on a class of graphs denoted by the authors as Markovian Spectrally Distinguishable graphs; the authors verify experimentally on a large
database of graphs that, as long as the graphs have some kind of irregularity or
randomness, the probability of not satisfying this assumption is very low. The 2011
paper by Weber et al. [169] extends the matching algorithm based on the construction
of a decision tree by Messmer and Bunke [109], significantly reducing the spatial
complexity for graphs whose nodes have a small number of different labels. In their
2004 paper [39], Dickinson et al. discuss the matching problem (graph isomorphism,
graph-subgraph isomorphism and maximum common subgraph) for the special case
of graphs having unique node labels. Finally, the 2012 paper by Dahm et al. [34] presents
a technique for speeding up existing exact subgraph isomorphism algorithms on large
graphs.
2.2. Inexact matching
Inexact matching methods have received comparatively more attention in the research community, both by extending existing techniques and by introducing novel
ideas unrelated to previous work. In particular, the extensions of previous methods
have mostly concerned algorithms based on the reduction of graph matching to a
continuous optimization problem, algorithms based on spectral properties of the
graphs (i.e. properties related to the eigenvalues and eigenvectors of the adjacency
matrix or of other matrices characterizing the graph structure), and methods approximating the solution of the graph matching problem by means of
bipartite graph matching, which is a simpler problem solvable in polynomial time.
Many inexact matching algorithms are formulated as an approximate way to
compute the GED. A 2010 paper by Gao et al. [57] presents a survey on this
topic. GED computes the distance between two graphs on the basis of the minimal
set of edit operations (e.g. node additions and deletions, etc.) needed to transform
one graph into the other one. A 2012 paper by Solé-Ribalta et al. [151] provides a
theoretical discussion on the relation between the properties of the distance function
and the costs assigned to each edit operation.
Although in principle the GED problem is not related to matching, in practice
most methods compute the distance by finding a matching for the nodes that are
preserved by the edit operations (i.e. those that are not added or removed, but
possibly have their label changed); given this matching, the edit distance can be
obtained as the sum of a term accounting for the matched nodes and their edges, and
a term accounting for the remaining nodes/edges (see Eq. (6)). So usually the outcome of the algorithm is not only an indication of the distance between the graphs,
but also the matching that is supposed to minimize the value of this distance. This is
why we have chosen to include some GED methods in this section.
2.2.1. Techniques based on tree search
Methods based on tree search have also been used for inexact matching. In this case,
the adopted heuristics may not ensure that the optimal solution is found, yielding a
suboptimal matching. As an example, Sanfeliu et al. [135,136] and Serratosa et al. [142]
extend their previous work on inexact matching of Function-Described Graphs
(FDG), which are Attributed Relational Graphs enriched with constraints on the joint
probabilities of nodes and edges, used to represent a set of graphs, while in Ref. 141,
Serratosa et al. detail how these FDG can be automatically constructed. Cook et al. [29]
in 2003 propose the use of beam search, a heuristic search method derived from the
A* algorithm, for computing the GED. The paper by Hidović and Pelillo [73] in 2004
extends the definition of a graph metric based on Maximum Common Subgraph,
introduced by Bunke in 1999, so that it can also be applied to graphs with node
attributes.
2.2.2. Continuous optimization
While graph matching is inherently a discrete optimization problem, several inexact
algorithms have been proposed to reformulate it as a continuous problem (by
relaxing some constraints), solve the continuous problem using one of the many
available optimization algorithms and then recast the found solution in the discrete
domain. Usually the algorithm used for the continuous problem only ensures that a
local optimum is found; moreover, since a discretization step is required afterwards,
the matching found may not even be guaranteed to exhibit local optimality.
An example of evolution of an existing matching method of this category is given
by the 2003 paper by Massaro and Pelillo [107], which improves a previous work on the
search for the Maximum Common Subgraph that uses a theorem by Bomze to reformulate this problem as a quadratic optimization in a continuous domain.
Zaslavskiy et al. [182] in their 2009 paper present a graph matching algorithm in which
the matching is formulated as a convex-concave programming problem which is
solved by interpolating between two approximate simpler formulations. Also the
2011 paper by Rota Bulò et al. [134] is based on the same formulation of graph
matching; in this case the authors solve the quadratic optimization problem using
infection-immunization dynamics, a new iterative algorithm based on evolutionary
game theory. The 2002 paper by van Wyk et al. [164] addresses the problem of Attributed Graph matching as a parameter identification problem, and proposes the use
of a Reproducing Kernel Hilbert Space (RKHS) interpolator to solve this problem.
The 2003 paper by van Wyk and van Wyk [161] extends the previous method by
providing a more general formulation of the problem. The same authors in a 2004
paper [163] further generalize the method, presenting a kernel-based framework for
graph matching which includes as special cases the previous two algorithms. In 2004,
van Wyk and van Wyk [162] present a graph matching algorithm based on the Projections Onto Convex Sets approach. The 2006 paper by Justice and Hero [81] proposes
a reformulation of the GED as a Binary Linear Programming problem, for which
they provide upper and lower bounds in polynomial time. Kostin et al. [89] in 2005
present an extension of the probabilistic relaxation algorithm by Christmas et al. [25]
Chevalier et al. [22] propose in a 2007 paper a technique that integrates probabilistic
relaxation with bipartite graph matching, applied to Region Adjacency Graphs. In
their 2008 paper [157], Torresani et al. introduce an algorithm based on a technique
called dual decomposition: the matching problem (in a continuous reformulation) is
decomposed into a set of simpler problems, depending on a parameter vector; the
simpler problems can be solved providing a lower bound to the minimization of the
functional to be optimized. Then the algorithm searches for the tightest bound by
varying the parameter vector. Caetano et al. [19] propose in 2009 a technique in which
the functional to be optimized has a parametric form, and the authors propose a
training phase to learn these parameters. In a 2011 paper [21], Chang and Kimia
present an extension of the Graduated Assignment Graph Matching by Gold and
Rangarajan, modified so as to work on hypergraphs instead of graphs. Zhou and De
la Torre [184] present a method called factorized graph matching in which the affinity
matrix used to define the functional to be optimized is factored into a Kronecker
product of smaller matrices, separately encoding the structure of the graphs and the
affinities between nodes and between edges. The authors propose an optimization
method based on this factorization that leads to an improvement in space and time
requirements.
Sanromà et al. [137] in 2012 propose a special purpose, probabilistic graph matching
method for graphs representing sets of 2D points, based on the Expectation
Maximization (EM) algorithm.
Solé-Ribalta and Serratosa [149] in their 2011 paper propose two sub-optimal
algorithms for the common labeling problem, a generalization of inexact graph
matching in which the number of graphs is larger than two (the problem cannot be
reduced to several pairwise matchings). The first proposed algorithm uses an extension of Graduated Assignment, while the second is based on a probabilistic formulation and adopts an iterative approach somewhat similar to Probabilistic
Relaxation. A 2011 paper by Ródenas et al. [131] presents a parallelized version of the
first algorithm. A 2013 paper by Solé-Ribalta and Serratosa [150] presents a further
development of the first algorithm, based on the matching of the nodes of all graphs
to a virtual node set.
2.2.3. Spectral methods
Spectral matching methods are based on the observation that the eigenvalues of a
matrix do not change if the rows and columns are permuted. Thus, given the matrix
representations of two isomorphic graphs (for instance, their adjacency matrices),
they have the same eigenvalues. The converse is not true; so, spectral methods are
inexact in the sense that they do not ensure the optimality of the solution found.
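Both statements can be verified on a small example. To keep the sketch free of external libraries, the spectrum is not computed directly: the code uses the power sums tr(A^k) for k = 1..n, which are the sums of the k-th powers of the eigenvalues and together determine the spectrum of an n x n matrix. This substitution, the helper names and the two example graphs (a star with four leaves, and a 4-cycle plus an isolated node, which share the spectrum {2, 0, 0, 0, -2}) are choices made here for illustration.

```python
def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def spectral_signature(A):
    """Power sums tr(A^k), k = 1..n: the sums of the k-th powers of the eigenvalues."""
    n = len(A)
    P, signature = A, []
    for _ in range(n):
        signature.append(sum(P[i][i] for i in range(n)))
        P = matmul(P, A)
    return signature


def adjacency(n, edges):
    A = [[0] * n for _ in range(n)]
    for i, j in edges:
        A[i][j] = A[j][i] = 1
    return A


def permute(A, p):
    """Relabel node i as p[i]: rows and columns are permuted together."""
    n = len(A)
    B = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            B[p[i]][p[j]] = A[i][j]
    return B


star = adjacency(5, [(0, 1), (0, 2), (0, 3), (0, 4)])
cycle_plus_isolated = adjacency(5, [(0, 1), (1, 2), (2, 3), (3, 0)])

print(spectral_signature(star))                              # [0, 8, 0, 32, 0]
print(spectral_signature(permute(star, [2, 0, 4, 1, 3])))    # same as above
print(spectral_signature(cycle_plus_isolated))               # same, yet not isomorphic
print(sorted(map(sum, star)), sorted(map(sum, cycle_plus_isolated)))
```

The last line prints the degree sequences, which differ: the two graphs have the same spectrum without being isomorphic, so equality of the spectra can suggest a match but cannot prove it.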
Caelli and Kosinov [16,18] in 2004 propose a matching algorithm that uses the graph
eigenvectors to define a vector space onto which the graph nodes are projected; a
clustering algorithm in this vector space is used to find possible matches. Also
Robles-Kelly and Hancock [130], in a 2007 paper, propose the embedding of graph nodes
into a different space (a Riemannian manifold) using spectral properties. The 2004
and 2005 papers by Robles-Kelly and Hancock [128,129] present a graph matching
approach based on Spectral Seriation of graphs: the adjacency matrix is transformed
into a sequence using spectral properties, then the matching is performed by computing the String Edit Distance between these sequences. Cour et al. [31] in 2007
propose a spectral matching method called balanced graph matching, using a novel
relaxation scheme that naturally incorporates matching constraints. The authors
also introduce a normalization technique that can be used to improve several other
algorithms such as the classical Graduated Assignment Graph Matching by Gold
and Rangarajan. Cho et al. [23] propose a reformulation of the inexact graph matching
as a random walk problem, and show that this formalization provides a theoretical
interpretation of both spectral methods and of some other techniques based on
continuous optimization; in this framework, the authors present an original algorithm based on techniques commonly used for Web ranking.
In a 2006 paper, Qiu and Hancock [118] present an approximate, hierarchical
method for graph matching that uses spectral properties to partition each graph into
nonoverlapping subgraphs, which are then matched separately, with a significant
reduction of the matching time. The same authors present a somewhat similar idea in
a 2007 paper [119], where the partition is based on commute times, which can be
computed from the Laplacian spectrum of the graph. Wilson and Zhu [171] in their 2008
paper present a survey of different techniques for the spectral representation of
graphs and trees. In 2011, Escolano et al. [45] propose a matching method based on the
representation of a graph as a bag of partial node coverages, described using spectral
features. In 2011, Duchenne et al. [40] present a generalization of spectral matching
techniques to hypergraphs, using some results from tensor algebra.


2.2.4. Other approaches


Among other techniques used for inexact matching, Bagdanov and Worring2 in a
2003 paper introduce a matching algorithm based on bipartite matching for the socalled First Order Gaussian Graphs (FOGG ), which are an extension of random
graphs having Gaussian random variables as their node attributes. Also the paper by
Skomorowski147 in 2007 presents a pattern recognition algorithm based on a variant
of random graphs, using for the matching a syntactic approach based on graph
grammars. The 2003 paper by Park et al.116 addresses the problem of partial
matching between a model graph and a larger image graph by combining a probabilistic formulation similar to the one used in probabilistic relaxation with a greedy
search technique. In a 2006 paper, Conte et al.27 present an inexact matching
technique for pyramidal graph structures, which is based on weighted bipartite graph
matching, but use information from the upper levels of a pyramid to constrain the
matching of the lower levels. Xiao et al.174 in 2008 propose a graph distance based on
a vector representation called Substructure Abundance Vector (SAV ), that can be
considered as an extension of the graph distance based on Maximum Common
Subgraph (MCS). The paper by Auwatanamongkol1 in 2007 proposes a genetic
algorithm for a special case of inexact matching, where the nodes are associated to 2D
points. Bourbakis et al.9 in 2007 introduce the so-called Local-Global graphs (L-G
graphs), as an extension of Region Adjacency graphs in which the edges are obtained
through a Delaunay triangulation, for which they introduce an inexact, suboptimal
matching algorithm which is based on a greedy search. In 2002, Wang et al.168
present a polynomial algorithm for the inexact graph-subgraph matching for the
special case of undirected acyclic graphs. The 2004 paper by Sebastian et al.140
presents a GED algorithm for the special case of shock graphs, based on dynamic
programming. In their 2008 paper,4 Bai and Latecki propose an inexact suboptimal
matching algorithm for skeleton graphs, based on the use of bipartite graph
matching. Chowdury et al.24 in a 2009 paper combine weighted bipartite graph
matching with the use of the automorphism groups for the cycles contained in the
graph, to improve the accuracy of the matching found. A 2009 paper by Riesen and
Bunke125 proposes an approximation of GED with the use of Bipartite Graph
Matching, solved using Munkres' algorithm. The 2010 paper by Kim et al.83
approximates the matching between Attributed Relational Graphs using a nested
assignment problem: an inner assignment step is used to find the best matching of the
adjacent edges; this information is then used to define a matching cost for the nodes,
and an outer assignment step finds the node matching that minimizes the sum of
these costs. This double application of the assignment problem is the original aspect
of this method, differentiating it, for instance, from the heuristic proposed by Riesen
and Bunke. Also Raveaux et al.121 in 2010 present an approximate algorithm based
on bipartite graph matching; in this case the aim is to compute an approximation of
the GED, and the bipartite matching is performed between small subgraphs of each
of the two graphs. In 2011, Fankhauser et al.46 present an algorithm for computing
the GED using bipartite graph matching, solved using the algorithm by Volgenant
and Jonker. The same authors47 in 2012 propose a suboptimal technique for graph
isomorphism, also based on bipartite graph matching. The algorithm has the distinctive feature that it either finds an exact solution or rejects the pair of graphs; thus
a slower algorithm can be used for the cases not covered by the proposed method.
Tang et al.155 in 2011 propose a graph matching algorithm based on the Dot
Product Representation of Graphs (DPRG) proposed by Scheinerman and Tucker,139
which represents each node using a numeric vector chosen so that each edge value
corresponds approximately to the dot product of the vectors of the two nodes connected by the edge; the
choice of the node vectors is formulated as a continuous optimization problem. The
proposed method is extended in a 2012 paper by the same authors.156 The 2011 paper
by Macrini et al.104 proposes a matching algorithm for bone graphs, which are a
representation for 3D shapes, using weighted bipartite graph matching. The paper by
Jiang et al.78 in 2011 presents a technique for inexact subgraph isomorphism based on
geometric hashing, requiring very little computational cost for the intended use case
of searching for several small input graphs within a large reference graph.
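The bipartite formulation of GED mentioned above for the method of Riesen and Bunke can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that costs depend on node labels only (the cited method also takes the local edge structure into account), and it replaces Munkres' algorithm with an exhaustive search over permutations, which is only viable for very small graphs. The function names and the unit costs are hypothetical.

```python
from itertools import permutations

INF = float("inf")

def build_cost_matrix(labels1, labels2, sub_cost=1.0, indel_cost=1.0):
    """Square (n+m) x (n+m) cost matrix: substitutions, deletions, insertions."""
    n, m = len(labels1), len(labels2)
    size = n + m
    cost = [[0.0] * size for _ in range(size)]  # lower-right block stays at zero
    for i in range(n):
        for j in range(m):
            # Upper-left block: substitution of node i with node j.
            cost[i][j] = 0.0 if labels1[i] == labels2[j] else sub_cost
        for j in range(n):
            # Upper-right block: deletion of node i (diagonal entries only).
            cost[i][m + j] = indel_cost if i == j else INF
    for i in range(m):
        for j in range(m):
            # Lower-left block: insertion of node j (diagonal entries only).
            cost[n + i][j] = indel_cost if i == j else INF
    return cost

def bipartite_ged_estimate(labels1, labels2, sub_cost=1.0, indel_cost=1.0):
    """Cost of the optimal assignment, found here by brute force.

    Assumption: exhaustive search stands in for Munkres' algorithm.
    """
    cost = build_cost_matrix(labels1, labels2, sub_cost, indel_cost)
    size = len(cost)
    return min(
        sum(cost[i][p[i]] for i in range(size))
        for p in permutations(range(size))
    )

# Two of the three labels can be matched; the third node must be deleted.
print(bipartite_ged_estimate(["a", "b", "c"], ["a", "b"]))
```

In a practical setting the exhaustive search would be replaced by a polynomial-time assignment solver, which is what makes this family of approximations attractive.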
A novel optimization technique, Estimation of Distribution Algorithms (EDA),
has been successfully used for inexact graph matching. EDA are somewhat similar to
genetic/evolutionary algorithms, but the parameters of each tentative solution are
considered as random variables, whose distribution is estimated from the most
promising solutions of the current generation; a stochastic sampling of this
distribution is used to produce the next generation.
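The general EDA scheme can be sketched as follows for a toy matching problem. This is only an illustration under simplifying assumptions, not one of the algorithms surveyed here: each node of the first graph is an independent random variable over the nodes of the second graph, the cost is the number of edges of the first graph that are not mapped onto an edge of the second one, and all names and parameter values are hypothetical.

```python
import random

def eda_match(edges1, n1, edges2, n2, pop=60, keep=20, gens=30, seed=0):
    """Univariate EDA sketch: returns (mapping, cost), mapping[i] in range(n2)."""
    rng = random.Random(seed)
    target_edges = {frozenset(e) for e in edges2}

    def cost(mapping):
        # Edges of graph 1 whose image is not an edge of graph 2.
        return sum(
            frozenset((mapping[u], mapping[v])) not in target_edges
            for u, v in edges1
        )

    # probs[i][j]: probability that node i of graph 1 is mapped to node j.
    probs = [[1.0 / n2] * n2 for _ in range(n1)]
    best = None
    for _ in range(gens):
        # Sample a new generation from the current distribution.
        population = [
            [rng.choices(range(n2), weights=probs[i])[0] for i in range(n1)]
            for _ in range(pop)
        ]
        population.sort(key=cost)
        elite = population[:keep]
        if best is None or cost(elite[0]) < cost(best):
            best = elite[0]
        # Re-estimate the distribution from the best individuals
        # (the initial count of 1 avoids zero probabilities).
        for i in range(n1):
            counts = [1.0] * n2
            for individual in elite:
                counts[individual[i]] += 1.0
            total = sum(counts)
            probs[i] = [c / total for c in counts]
    return best, cost(best)

mapping, c = eda_match([(0, 1), (1, 2)], 3, [(0, 1), (1, 2), (2, 3)], 4)
print(mapping, c)
```

Being a stochastic search, the sketch gives no guarantee of finding the optimal mapping, which is consistent with the suboptimal nature of these methods.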
The paper by Bengoetxea et al.5 in 2002 proposes the use of EDA for inexact,
suboptimal graph matching, by associating each node of the first graph to a random
variable whose possible values are the nodes of the second graph. In 2005, Cesar et al.80
formulate the inexact graph homomorphism as a discrete optimization problem, and
compare beam search, genetic algorithms and EDA for solving this problem. A different
approach, also based on a probabilistic framework, is proposed by Caelli and Caetano17
in 2005; the matching is formulated as an inference problem on a Hidden Markov
Random Field (HMRF), for which an approximate solution is computed.
The 2004 paper by Dickinson et al.38 defines a graph similarity measure for the
special case of graphs having unique node labels, and proposes a hierarchical algorithm to efficiently compute this measure. He et al.72 in 2004 propose an ad hoc
matching algorithm for skeleton graphs, that performs a linearization of the graphs,
and then uses string matching to find an inexact correspondence. A similar approach
is presented in the paper by Das et al.35 in 2012 for graphs obtained from fingerprints.
In 2008, Gao et al.56 introduce a Graph Distance algorithm for the special case of
graphs whose nodes represent points in a 2D space, based on the Earth Mover's
Distance (EMD). The 2009 paper by Emms et al.44 presents an original approach to
graph matching based on quantum computing, that uses the inherent parallelism of
some quantum physics phenomena if run on a (hypothetical) quantum computer.

3. Other Problems
In this section, we will present some recent developments on graph problems that are
not, in a strict sense, forms of graph matching, but are related to matching either
because they provide a way of comparing two graphs (this is the case for graph
embeddings and graph kernels), or because they use a graph-based approach to group
input patterns into classes (in an unsupervised way for graph clustering, and in a
supervised or semi-supervised way for graph learning). We also mention some works
on other graph-related problems which are of specific interest as basic tools for Pattern
Recognition, such as dimensionality reduction.
Graph embeddings and graph kernels are perhaps the most significant novelty in
graph-based Pattern Recognition in recent years. Although seminal works in
these fields were already present in earlier literature, it is in the last decade that these
techniques have gained popularity in the Pattern Recognition community. Gaertner
et al.55 present an early survey on kernels applied to nonvectorial data. Bunke
et al.12 in 2005 present a survey of graph kernels and other graph-related techniques.
Bunke and Riesen14 in their 2011 paper propose a useful review on the topic of graph
kernels and graph embeddings; the same authors15 in 2012 extend this review and
present these techniques as a way to unify the statistical and structural approaches
in Pattern Recognition. Please note that, although it may seem that graph embeddings and kernels could help reduce the computational complexity of graph comparison, many of the proposed algorithms have a cost that is equal to or higher than
that of traditional matching methods (for instance, some embedding methods require
computing the GED, while others involve a cost that is related to the number of
graphs in the considered set). The main benefit of the novel techniques lies instead in
the availability of the large corpus of theoretically sound techniques from statistical
Pattern Recognition.
3.1. Graph embeddings
In the literature the term Graph embedding is used with two slightly different
meanings:
• a technique that maps the nodes of a graph onto points in a vector space, in such a
  way that nodes having similar structural properties (e.g. the structure of their
  neighborhood) will be mapped onto points which are close in this space;
• a technique that maps whole graphs onto points in a vector space, in such a way
  that similar graphs are mapped onto close points (see Fig. 2).
References 16, 18, 45 and 130, discussed previously in Sec. 2.2, are examples of
the first kind; also, the Dot Product Representation of Graphs139 mentioned in
Sec. 2.2 belongs to this category. Yan et al.175 show in their 2007 paper that most
commonly used dimensionality reduction techniques can be formulated as a graph
embedding algorithm of this kind. Their work is the basis for an embedding technique proposed by You et al.,177 called General Solution for Supervised Graph
Embedding (GSSGE).
In the following subsections, we will mainly concentrate on the second kind of
graph embedding, presenting the relevant methods categorized according to the
main properties they attempt to preserve in the mapping.
3.1.1. Isometric embeddings
Methods in this category start from a distance or similarity measure between graphs,
and attempt to find a mapping to vectors that preserves this measure.
Bonabeau,6 in a 2002 paper, proposes a technique based on a Self-Organizing Map
(SOM), an unsupervised neural network adopting competitive learning, in order to
map graphs onto a bidimensional plane. Although the term embedding is not explicitly used, it can be considered a form of graph embedding. The mapping found by
the network is used both as an aid for the visualization of the data represented by the
graphs, and for clustering.
The 2003 paper by de Mauro et al.36 also uses a Neural Network for graph
embedding. In particular, the proposed method works on directed acyclic graphs, and
uses a Recursive Neural Network. The network is trained by similarity learning: the
training set is made of pairs of graphs which have been manually labeled with a
similarity value, and the network aims to produce an output vector for each graph so
that the Euclidean distance between vectors is consistent with the similarity between
the corresponding graphs.

Fig. 2. Graph Embedding: the mapping between graphs and points in a vector space is represented by the
graph name.

A recent paper by Jouili and Tabbone79 proposes a graph embedding technique
based on constant shift embedding, a framework proposed for the embedding of
nonmetric spaces, mainly applied to clustering problems.

3.1.2. Spectral embeddings


The embedding algorithms in this subsection are based on the exploitation of spectral properties of graphs, i.e. properties related to the eigenvalues and eigenvectors of
matrices representing the graphs, such as the adjacency matrix. Since spectral
properties are invariant with respect to node permutations, they ensure that graphs
with an isomorphic structure will be mapped to the same vectors.
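The invariance property can be checked on a small example. The sketch below is an illustration under stated assumptions and does not reproduce any of the methods cited in this subsection: instead of an explicit eigendecomposition it uses the spectral moments, i.e. the traces of the powers of the adjacency matrix, since tr(A^k) equals the sum of the k-th powers of the eigenvalues of A. The function names are hypothetical. Note that the converse does not hold in general: nonisomorphic graphs can share the same spectrum.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of lists."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]

def spectral_moments(adj, k_max=4):
    """Vector [tr(A), tr(A^2), ..., tr(A^k_max)] of an adjacency matrix."""
    n = len(adj)
    power = [row[:] for row in adj]
    moments = []
    for _ in range(k_max):
        moments.append(sum(power[i][i] for i in range(n)))
        power = matmul(power, adj)
    return moments

def permute(adj, perm):
    """Adjacency matrix of the same graph after renumbering its nodes."""
    n = len(adj)
    return [[adj[perm[i]][perm[j]] for j in range(n)] for i in range(n)]

# Path graph with three nodes, and the same graph with renumbered nodes.
path = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]
renumbered = permute(path, [1, 2, 0])

print(spectral_moments(path))        # [0, 4, 0, 8]
print(spectral_moments(renumbered))  # same vector
```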
Luo et al.101 in a 2003 paper propose the use of spectral features for graph
embedding; in particular, they decompose the adjacency matrix of a graph into its
principal eigenmodes, and then compute from them a vector of numerical features
(e.g. eigenmode volume, eigenmode perimeter, inter-eigenmode distances, etc.).
Also the 2005 paper by Wilson et al.170 uses spectral properties to define a graph
embedding; in this case, the authors derive a set of polynomials from the spectral
decomposition of the Laplacian matrix of the graph, and use the coefficients of
these polynomials as feature vectors.
Also the 2009 paper by Xiao et al.172 proposes a graph embedding based on
spectral properties; in particular the method uses the heat kernel, i.e. the solution of
the heat equation on the graph, to obtain a set of invariant properties used to obtain
a vector representation of the graph.
Xiao et al.173 in a 2011 paper present an embedding for hierarchical graphs,
obtained by a hierarchical segmentation of images. Spectral features are computed
on the levels of the hierarchy, obtaining a fixed-size feature vector.
3.1.3. Subpattern embeddings
These methods are based on the detection, or the enumeration, of specific types of
subpatterns within the graphs to be embedded.
Torsello and Hancock158 in 2007 propose an embedding algorithm for trees. The
algorithm requires that all the trees to be embedded are known in advance. The
embedding is based on the construction of a Union Tree, which is a directed, acyclic
graph having all the considered trees as subgraphs; then each tree is represented by a
vector that encodes which nodes of the Union Tree are used by the tree.
Czech33 proposes in a 2011 paper an embedding method based on B-matrices,
which are a structure based on the path lengths between the nodes of a graph and are
invariant with respect to node permutations.
A recent paper by Luqman et al.103 presents a fuzzy multilevel embedding technique, that combines structural information of the graph and information from the
graph attributes using fuzzy histograms. The method uses an unsupervised learning
phase to find the fuzzy intervals used in the representation.
In a 2011 paper, Gibert et al.60 present a graph embedding based on graphs of
words, which are an extension of the popular bag of words approach. The method
assumes that the graphs are obtained from images, with nodes corresponding to
salient points, and node attributes corresponding to visual descriptors of the points.
The method performs a quantization of the attribute space, constructing a codebook.
This codebook is used to produce an intermediate graph, called graph of words,
whose nodes are the codebook values, and whose edges correspond to the adjacency
in the original graph of nodes mapped to those codebook values. The nodes and edges
of the intermediate graph are labeled with the counts of the corresponding nodes/
edges of the original graph; then a histogram of these counts is used as the
embedding. Two 2012 papers by the same authors further develop this method: in
Ref. 62 the authors add a more sophisticated procedure for constructing the codebook, while in Ref. 61 they use a large set of features and apply a feature selection
algorithm to determine the most significant ones. The same authors, in a 2013
paper,63 propose a somewhat similar embedding technique, that removes the
assumption that the graphs are obtained from images, and also exploits edge
attributes if they are present.
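The construction of the graph of words described above can be sketched as follows. This is a simplified illustration, not the implementation of the cited papers: the codebook is assumed to be given (the cited method builds it by quantizing the attribute space), the nearest codeword is found with the squared Euclidean distance, and all names are hypothetical.

```python
def graph_of_words_embedding(node_features, edges, codebook):
    """Histogram of codeword counts followed by codeword-pair edge counts."""

    def nearest_word(x):
        # Index of the codeword closest to the node descriptor x.
        return min(
            range(len(codebook)),
            key=lambda c: sum((a - b) ** 2 for a, b in zip(x, codebook[c])),
        )

    words = [nearest_word(x) for x in node_features]
    k = len(codebook)

    # Node labels of the graph of words: nodes mapped to each codeword.
    node_counts = [0] * k
    for w in words:
        node_counts[w] += 1

    # Edge labels: original edges whose endpoints map to each codeword pair.
    pairs = [(a, b) for a in range(k) for b in range(a, k)]
    edge_counts = {p: 0 for p in pairs}
    for u, v in edges:
        a, b = sorted((words[u], words[v]))
        edge_counts[(a, b)] += 1

    return node_counts + [edge_counts[p] for p in pairs]

codebook = [(0.0, 0.0), (10.0, 10.0)]
features = [(1.0, 0.0), (0.0, 1.0), (9.0, 10.0)]
print(graph_of_words_embedding(features, [(0, 1), (1, 2)], codebook))
# [2, 1, 1, 1, 0]
```

The length of the resulting vector depends only on the size of the codebook, so graphs with different numbers of nodes are mapped into the same vector space.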
The 2010 paper by Richiardi et al.122 proposes two graph embedding techniques
specifically tailored for graphs having the following constraints: the number of nodes
is fixed across the whole considered set of graphs, and a total ordering is defined on the
set of nodes. The authors show that a graph embedding exploiting these constraints
can outperform a more general one.
3.1.4. Prototype-based embeddings
These embedding methods assume that a set of prototype graphs is available, and
the mapping of a graph onto a vector space is based on the distances (obtained
according to a suitably defined distance function) of the graph from the prototypes.
This technique can be seen as a special case of the dissimilarity representations
introduced by Pekalska and Duin.117
The first of these methods was proposed in 2007 by Riesen et al.127 The
method has one prototype graph for each dimension of the vector space; the corresponding component of the vector is simply defined as the GED between the prototype and the graph to be embedded. The authors discuss several strategies for
choosing the prototypes from a training set, and evaluate them by using the
embedding for several classification tasks. In the same year, a paper by Riesen and
Bunke124 further develops this idea by proposing the use of several sets of randomly
chosen prototypes, and combining the classifiers obtained for each of the corresponding embeddings to form a Multiple Classifier System. The advantage is that the
resulting classifier is more robust with respect to the risk of a poor choice of the
prototypes. A 2009 paper by Lee and Duin93 explores a similar idea, but instead of a
random selection of the prototypes, the proposed method creates different base
classifiers by using node label information for extracting different sets of subgraphs
from the training set. In 2010, Lee et al.94 propose a similar method in which, instead
of extracting subgraphs, the node label information is used to alter the training
graphs without changing their size.
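The basic prototype-based embedding can be sketched in a few lines. The sketch is an illustration under explicit assumptions: graphs are restricted to unique node labels, so that a simple symmetric-difference distance can stand in for the GED, and the helper names (`make_graph`, `distance`, `prototype_embedding`) are hypothetical.

```python
def make_graph(nodes, edges):
    """Graph with unique node labels: (set of labels, set of undirected edges)."""
    return frozenset(nodes), frozenset(frozenset(e) for e in edges)

def distance(g1, g2):
    """Stand-in for GED: nodes and edges present in only one of the graphs."""
    nodes1, edges1 = g1
    nodes2, edges2 = g2
    return len(nodes1 ^ nodes2) + len(edges1 ^ edges2)

def prototype_embedding(graph, prototypes, dist=distance):
    """One vector component per prototype: the distance from that prototype."""
    return [dist(graph, p) for p in prototypes]

p1 = make_graph("abc", [("a", "b"), ("b", "c")])
p2 = make_graph("ab", [("a", "b")])
g = make_graph("abc", [("a", "b"), ("b", "c"), ("a", "c")])

print(prototype_embedding(g, [p1, p2]))   # [1, 3]
print(prototype_embedding(p1, [p1, p2]))  # [0, 2]
```

Any other graph distance can be passed through the `dist` parameter; the choice of the prototypes, which is the main topic of the papers discussed here, is left outside the sketch.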
In a 2009 paper, Riesen and Bunke123 present a Lipschitz embedding for graphs.
Lipschitz embedding is usually employed to regularize vector spaces, but in this case
it is proposed as a method to construct a graph embedding. Basically, each component of the vector representation of a graph is deduced from a set of prototype
graphs; the value of the component is the mean distance (using GED) from the
corresponding set of prototypes (a different aggregation function than the mean
could be used). The sets of prototypes are constructed using a clustering of a training
set, based on the K-Medoids clustering algorithm. The same authors in another 2009
paper126 propose a method for reducing the dimensionality of this embedding, by
using Principal Component Analysis and Linear Discriminant Analysis. Bunke and
Riesen13 in 2011 propose an extension to this technique, which formulates the
problem of choosing the reference graphs as a feature selection problem: a first embedding is
built using a large number of reference graphs; then a feature selection algorithm is
applied to the obtained vectors in order to select the most significant features, and
only the reference graphs corresponding to these features are retained.
The 2012 paper by Borzeshi et al.8 also addresses the problem of selecting the
reference graphs for graph embedding. The authors present several algorithms which
are based on a discriminative approach: they define several objective functions to
measure how much the prototypes are able to discriminate between classes, and
select the prototypes by a greedy optimization of these functions.
3.2. Graph kernels
A graph kernel is a function that maps a pair of graphs onto a real number, and
has similar properties to the dot product defined on vectors. More formally, if we
denote with G the space of all the graphs, a graph kernel is a function k such that:
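\begin{align}
k &: \mathcal{G} \times \mathcal{G} \rightarrow \mathbb{R}, \tag{10} \\
k(G_1, G_2) &= k(G_2, G_1) \quad \forall\, G_1, G_2 \in \mathcal{G}, \tag{11} \\
\sum_{i=1}^{n} \sum_{j=1}^{n} c_i \, c_j \, k(G_i, G_j) &\geq 0 \quad \forall\, G_1, \ldots, G_n \in \mathcal{G}, \ \forall\, c_1, \ldots, c_n \in \mathbb{R}. \tag{12}
\end{align}
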

Equation (11) requires the function k to be symmetric, while Eq. (12) requires it to be
positive semi-definite.
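The two requirements can be checked numerically on a small set of graphs. The sketch below is only an illustration: it uses a deliberately simple kernel (the dot product of node-label histograms, a hypothetical choice made here because it is positive semi-definite by construction), and it tests the quadratic form of Eq. (12) on a finite sample of coefficient vectors, which can reveal a violation but cannot prove the property.

```python
import random
from collections import Counter

def label_histogram_kernel(labels1, labels2):
    """Dot product of the node-label count vectors of two graphs."""
    c1, c2 = Counter(labels1), Counter(labels2)
    return sum(c1[label] * c2[label] for label in c1)

def gram_matrix(graphs, kernel):
    """Matrix of the kernel values for every pair of graphs."""
    return [[kernel(g, h) for h in graphs] for g in graphs]

def is_symmetric(K):
    n = len(K)
    return all(K[i][j] == K[j][i] for i in range(n) for j in range(n))

def quadratic_form(K, c):
    """Left-hand side of Eq. (12): sum_i sum_j c_i c_j k(G_i, G_j)."""
    n = len(K)
    return sum(c[i] * c[j] * K[i][j] for i in range(n) for j in range(n))

def no_violation_found(K, trials=200, seed=0):
    """True if no sampled coefficient vector gives a negative quadratic form."""
    rng = random.Random(seed)
    n = len(K)
    return all(
        quadratic_form(K, [rng.uniform(-1, 1) for _ in range(n)]) >= -1e-9
        for _ in range(trials)
    )

# Each graph is represented here only by the list of its node labels.
graphs = [["a", "a", "b"], ["a", "b", "b"], ["c"]]
K = gram_matrix(graphs, label_histogram_kernel)
print(K)                      # [[5, 4, 0], [4, 5, 0], [0, 0, 1]]
print(is_symmetric(K), no_violation_found(K))
```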
Informally, a graph kernel can be considered as a measure of the similarity between two graphs; however its formal properties allow a kernel to replace the vector
dot product in several vector-based algorithms that use this operator (and other
functions related to dot product, such as the Euclidean norm). Among the many
Pattern Recognition techniques that can be adapted to graphs using kernels we
mention Support Vector Machine classifiers and Principal Component Analysis.
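As an example of how a kernel can replace operations derived from the dot product, assume that there is an implicit mapping \phi (a symbol introduced here only for illustration) such that k(G_1, G_2) = \langle \phi(G_1), \phi(G_2) \rangle. Under this assumption the squared Euclidean distance between the images of two graphs can be written using kernel evaluations only:

```latex
\begin{align*}
\| \phi(G_1) - \phi(G_2) \|^2
  &= \langle \phi(G_1), \phi(G_1) \rangle
     - 2 \langle \phi(G_1), \phi(G_2) \rangle
     + \langle \phi(G_2), \phi(G_2) \rangle \\
  &= k(G_1, G_1) - 2\, k(G_1, G_2) + k(G_2, G_2)
\end{align*}
```

Any algorithm that accesses its input vectors only through dot products, norms or Euclidean distances can thus be applied to graphs without ever computing the mapping explicitly.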
Kernels have been used for a long time to extend to the nonlinear case linear
algorithms working on vector spaces, thanks to Mercer's theorem: given a kernel
function defined on a compact Hausdorff space X, there is a vector space V and a
mapping between X and V such that the value of the kernel computed on two points
in X is equal to the dot product of the corresponding points in V. Thus a kernel can
be seen as an implicit way of performing an embedding into a vector space. Although
Mercer's theorem does not apply to graph kernels, in practice the latter can be used
as a theoretically sound way to extend a vector algorithm to graphs. Of course, the
performance of these algorithms strongly depends on the appropriateness (with respect to the task at hand) of the notion of similarity embodied in the graph kernel.
In their 2003 paper, Kashima et al.82 specialize for the graph domain the idea of
marginalized kernels, a probabilistic technique for defining a kernel based on the
introduction of hidden variables. In this case, the hidden variable is a sequence of
node indices, generated according to a random walk on one of the graphs. Given a
value for the hidden variable, a kernel on sequences is computed using the sequence
of visited nodes and edges; the marginalized kernel is obtained by computing the
expected value (with respect to the joint distribution of the hidden and visible
variables) of this sequence kernel. Mahe and Vert105 in 2009 extend this technique to
trees, and present an application to molecular data.
Borgwardt and Kriegel7 in 2005 present a graph kernel that is based on paths,
instead of walks (a path is a walk without repeated nodes); in order to avoid the
exponential cost of enumerating all the paths in a graph, the authors propose a
scheme to use only the shortest path between any pair of nodes, since the shortest
paths can be computed in polynomial time.
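The idea of comparing graphs through their shortest paths can be sketched as follows. This is a simplified variant and not the kernel of the cited paper: it assumes unweighted, undirected graphs without node labels, computes all shortest path lengths with the Floyd-Warshall algorithm, and counts the pairs of shortest paths (one from each graph) that have the same length. All names are hypothetical.

```python
from collections import Counter

INF = float("inf")

def shortest_path_lengths(n, edges):
    """All-pairs shortest path lengths (Floyd-Warshall), polynomial in n."""
    d = [[0 if i == j else INF for j in range(n)] for i in range(n)]
    for u, v in edges:
        d[u][v] = d[v][u] = 1
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d

def path_length_histogram(n, edges):
    """Number of node pairs at each finite shortest-path distance."""
    d = shortest_path_lengths(n, edges)
    return Counter(
        d[i][j] for i in range(n) for j in range(i + 1, n) if d[i][j] != INF
    )

def shortest_path_kernel(g1, g2):
    """Pairs of shortest paths with equal length (dot product of histograms)."""
    h1 = path_length_histogram(*g1)
    h2 = path_length_histogram(*g2)
    return sum(h1[length] * h2[length] for length in h1)

path3 = (3, [(0, 1), (1, 2)])
triangle = (3, [(0, 1), (1, 2), (0, 2)])
print(shortest_path_kernel(path3, path3))     # 5
print(shortest_path_kernel(path3, triangle))  # 6
```

Since the value is a dot product of explicit histogram vectors, this simplified kernel satisfies the symmetry and positive semi-definiteness requirements by construction.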
Neuhaus and Bunke,112 in their 2006 paper, define three graph kernels based on
GED. The first kernel requires the choice of a zero pattern, a graph that, with respect
to the kernel, will behave similarly to a null vector. The authors show that this kernel
fulfills the theoretical requirements of a kernel function, but its practical performance
is strongly affected by the choice of the zero pattern. The authors then introduce two
other kernels, obtained from the sum and the product of the first kernel over a set of
zero patterns, and show that they have the same theoretical properties, but are more
robust with respect to the choice of these patterns.
In their 2009 paper, Neuhaus et al.114 present three possible ways to use GED in
the definition of a kernel. The first way is a diffusion kernel, which turns an edit
distance matrix into a positive definite matrix satisfying the kernel properties, but
has the inconvenience that the set of graphs to which it is applied must be finite and
known a priori. The second way is a convolution kernel, which is based on a decomposition of the edit path between the two graphs into a sequence of substitution
operations; given a kernel for individual substitutions, this approach provides a
definition for a kernel between two graphs. The main drawback is the exponential
complexity with respect to the number of nodes, for which the authors suggest an
approximation. The third way is a random walk kernel, where the GED is used to
define a fuzzy product graph, from which a kernel is obtained that evaluates the local
similarity of corresponding parts of the two graphs.
The 2012 paper by Gaüzère et al.59 presents two graph kernels. The first, called
Laplacian kernel, is based on the GED (approximated using the algorithm by Riesen
and Bunke125). The product operation derived from the GED is not guaranteed to be
positive definite, and so does not have the formal properties of a kernel; so the
authors propose a technique to obtain from the distance matrix a positive definite
matrix, which is then used as the kernel. The second kernel, called the treelet kernel,
is based on treelets, which are all the possible trees having fewer than a fixed number of
nodes (in the paper, treelets up to six nodes are considered); the kernel is computed
by counting the occurrences of each treelet in the graphs. This kernel can only be
used for unattributed graphs, while the Laplacian kernel can also be employed for
graphs having node and edge attributes. The same authors in Ref. 58 propose a
kernel that is also based on treelets, but instead of simply counting their occurrences,
uses a treelet edit distance to compare the treelets in one graph with those in the
other one, so as to be tolerant with respect to slight deformations of the graphs.
Grenier et al.66 in their 2013 paper propose a different treelet-based kernel, specifically devised for chemoinformatics applications, that also incorporates information
on the position of each treelet within the graph.
Shervashidze et al.,145 in their 2009 paper, present a kernel based on the use of
graphlets, that is all the possible graphs having fewer than a fixed number of nodes.
Like the previously mentioned treelet kernel, the graphlet kernel has the limitation of being applicable only to unlabeled graphs. The paper considers graphlets up
to five nodes, and proposes two different techniques to reduce the computational cost
of finding all the occurrences of the graphlets in a large graph: the first is a probabilistic
technique based on sampling, that replaces the exact number of graphlets with an
estimate that is ensured to converge in probability to the true value; the second
technique is applicable only to bounded-valence graphs, and it is based on an efficient
algorithm for enumerating, on this kind of graphs, all the paths up to a fixed length.
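A graphlet-based kernel can be illustrated on the smallest nontrivial case. The sketch below is a simplification under stated assumptions, not the method of the cited paper: it considers only graphlets with three nodes (which are fully identified by their number of edges), counts them by exhaustive enumeration rather than by sampling, and compares the normalized count vectors with a dot product. The function names are hypothetical.

```python
from itertools import combinations

def graphlet3_counts(n, edges):
    """Counts of induced 3-node subgraphs having 0, 1, 2 and 3 edges."""
    edge_set = {frozenset(e) for e in edges}
    counts = [0, 0, 0, 0]
    for a, b, c in combinations(range(n), 3):
        k = sum(
            frozenset(pair) in edge_set
            for pair in ((a, b), (a, c), (b, c))
        )
        counts[k] += 1
    return counts

def graphlet3_kernel(g1, g2):
    """Dot product of the normalized 3-node graphlet frequency vectors."""
    def frequencies(graph):
        counts = graphlet3_counts(*graph)
        total = sum(counts)
        return [c / total for c in counts] if total else [0.0] * 4
    f1, f2 = frequencies(g1), frequencies(g2)
    return sum(x * y for x, y in zip(f1, f2))

path4 = (4, [(0, 1), (1, 2), (2, 3)])
complete4 = (4, [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)])
print(graphlet3_counts(*path4))             # [0, 2, 2, 0]
print(graphlet3_kernel(path4, complete4))   # 0.0
```

The exhaustive enumeration grows quickly with both the graph size and the graphlet size, which is the cost that the sampling and bounded-valence techniques mentioned above are meant to reduce.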
An extension of the idea of graphlet kernels is introduced by Kondor et al.88 in
2009. The authors define a set of graph invariants, called the graphlet spectrum,
based on the generalization of Fourier transforms over permutation groups. The
kernel based on these invariants has the advantages of being applicable to labeled
graphs, and of taking into account the position of the graphlets within the larger
graph, and not only their frequency of occurrence.
Bai and Hancock, in a 2013 paper,3 define a novel kernel based on the Jensen–Shannon
divergence, which is an information-theoretic measure of the dissimilarity between
probability distributions, defined in terms of their entropy. To
apply this measure to graphs, the authors derive from each graph a probability
distribution, based on the random walks on the graphs. Rossi et al. in their paper133
propose an evolution of this method, defining a kernel that is similarly based on the
Jensen–Shannon divergence, but uses continuous-time quantum walks instead of
classical random walks.
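The Jensen–Shannon divergence itself is simple to compute once the probability distributions are available. The sketch below is only an illustration and does not reproduce the construction of the cited papers: as assumptions made here, each graph is summarized by the stationary distribution of a classical random walk (node degree divided by twice the number of edges), sorted and zero-padded so that graphs of different size can be compared, and the divergence is turned into a similarity as ln 2 minus the divergence.

```python
import math

def entropy(p):
    """Shannon entropy (natural logarithm) of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0)

def jensen_shannon_divergence(p, q):
    """H((p + q) / 2) - (H(p) + H(q)) / 2, after zero-padding to equal length."""
    size = max(len(p), len(q))
    p = list(p) + [0.0] * (size - len(p))
    q = list(q) + [0.0] * (size - len(q))
    mixture = [(a + b) / 2 for a, b in zip(p, q)]
    return entropy(mixture) - (entropy(p) + entropy(q)) / 2

def steady_state_distribution(n, edges):
    """Stationary random-walk probabilities deg(v) / (2|E|), sorted (assumption)."""
    degree = [0] * n
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    total = 2 * len(edges)
    return sorted((d / total for d in degree), reverse=True)

def js_kernel(p, q):
    """Similarity in [0, ln 2]: ln 2 minus the divergence (assumed form)."""
    return math.log(2) - jensen_shannon_divergence(p, q)

p = steady_state_distribution(3, [(0, 1), (1, 2)])          # path
q = steady_state_distribution(3, [(0, 1), (1, 2), (0, 2)])  # triangle
print(p)  # [0.5, 0.25, 0.25]
print(jensen_shannon_divergence(p, q), js_kernel(p, q))
```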
In a 2011 paper, Strug153 proposes a kernel specifically devised for hierarchical
graphs, which is based on the combination of a tree kernel with a classical graph
kernel.
Lozano and Escolano propose the use of graph kernels in a slightly different
sense, as an aid to improve the performance of other operations on graphs. For
instance, in Ref. 99 they present a kernelized version of the classical Graduated Assignment
Graph Matching algorithm by Gold and Rangarajan, yielding an improvement in the
accuracy and the robustness to noise of the matching. In Ref. 100 the same authors
adopt a kernel for defining a graph-matching cost function, that is then used for a
kernelized version of two existing matching algorithms; in this paper the authors also
define a kernel-based algorithm for constructing a prototype graph from a set of
graphs, using this technique for graph clustering.
A recent paper by Lee et al.92 investigates the different impact of structural
information and graph attributes within a graph kernel, using a kernel based on the
shortest paths, modified so as to have the possibility of changing the relative weights
of the two kinds of information. The authors show experimentally that these two
kinds of information are essentially different, and can reinforce each other. A similar
investigation, reaching the same conclusion, is carried out using a GED for comparing the
graphs.
3.3. Graph clustering
The term Graph clustering is actually used in the literature with two different and
unrelated meanings, both of which may be of interest for researchers working in the Pattern
Recognition field: in the first sense, graphs are used to represent each of the objects to
be clustered, so the clustering is performed on a set of graphs (see Fig. 3). In the second
sense, which is the most frequently encountered, a single graph is used to represent the
structure of the space to which the objects belong, with a node for each object, and
edges encoding the relationships between pairs of objects (usually a similarity or a
distance measure is associated with each edge); in this case the clustering is performed
by partitioning the set of nodes of the graph according to some criterion (see Fig. 4). In
order to differentiate between the two meanings of the term, we will speak of clustering
of graphs when referring to the first sense, and graph-based clustering when referring
to the second one. This latter problem is related to graph-based segmentation, which
is a wide field of research that is not included in this survey.
3.3.1. Clustering of graphs
Regarding the clustering of graphs, Günter and Bunke69 in 2002 present an extension
to graphs of the Unsupervised Learning Vector Quantization (LVQ). The algorithm
uses GED to evaluate the distance between an input graph and a cluster prototype, and an original algorithm, also based on GED, that computes the weighted
combination of two graphs (by determining the minimal set of edit operations to
transform the first graph into the second, and then choosing a subset of these
operations depending on the weight), which is used for updating the winning prototype. In 2003 the same authors70 propose an extension of this method, by introducing a set of clustering validation indices to choose the optimal number of LVQ
nodes.

Fig. 3. An example of graph clustering in the first meaning (clustering of graphs): each of the objects to
be clustered is represented by a graph.

Serratosa et al.141 propose an algorithm for the clustering of graphs based on
Function-Described Graphs, which are Attributed Relational Graphs extended with
information about constraints on the joint probabilities of nodes and edges. The
algorithm is based on an incremental, hierarchical clustering strategy.

Fig. 4. An example of graph clustering in the second meaning (graph-based clustering): the clustering is
performed by partitioning the set of nodes of a single graph.

The 2011 paper by Jain and Obermayer77 also presents a method for the clustering of graphs, based on Vector Quantization with the k-Means algorithm. The
proposed algorithm uses an embedding of graphs into Riemannian orbifolds, based
on GED, to perform the quantization. The authors present an extensive discussion
of the theoretical properties of the proposed approach, providing some necessary
conditions for optimality of the found clustering and for statistical consistency;
the authors also discuss the impact of possible approximations for reducing the
computational cost.
3.3.2. Graph-based clustering
Among the recent algorithms proposed for graph-based clustering, the paper by
Guigues et al.68 in 2003 defines the so-called cocoons, which are connected subgraphs
characterized by the fact that the maximum dissimilarity between nodes within the
subgraph is less than the minimum dissimilarity between a node within the subgraph
and an outside node. The authors demonstrate that the cocoons of a graph form a
hierarchy, and define an algorithm for constructing this hierarchy, that can be used
for a hierarchical clustering of the nodes of the graph. The same authors in Ref. 67
present a different method for obtaining a hierarchical representation, applied to
image representation and segmentation.
The 2006 paper by Brás Silva et al.10 proposes an algorithm that is based on the
graph coloring problem. Graph coloring involves assigning labels (called colors) to
the nodes of a graph so that adjacent nodes have different colors, with the goal of
minimizing the total number of colors used. The proposed clustering algorithm uses a
greedy coloring technique from the literature, and then uses the resulting color assignment as an aid to decide how to aggregate the nodes into clusters.
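A generic greedy coloring can be sketched as follows. The sketch is an assumption made for illustration: it is a plain first-fit heuristic and not necessarily the specific greedy technique adopted in the cited paper, and it does not include the subsequent aggregation of the nodes into clusters. The function name and the optional `order` parameter are hypothetical.

```python
def greedy_coloring(n, edges, order=None):
    """First-fit coloring: each node gets the smallest color unused by its neighbors."""
    neighbors = [set() for _ in range(n)]
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    colors = {}
    for v in (order if order is not None else range(n)):
        used = {colors[u] for u in neighbors[v] if u in colors}
        c = 0
        while c in used:
            c += 1
        colors[v] = c
    return colors

# Cycle with four nodes: two colors are sufficient.
cycle4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(greedy_coloring(4, cycle4))  # {0: 0, 1: 1, 2: 0, 3: 1}
```

The result is always a valid coloring, but the number of colors depends on the order in which the nodes are visited and is not guaranteed to be minimal.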
Grady and Schwartz65 in 2006 present a graph-based clustering technique based
on continuous optimization. The functional to be minimized is chosen so as to have a
linear optimization problem, which can be solved with less computational cost and
more numerical stability than other functionals used in graph-based clustering.
However, the algorithm requires the choice of a ground node, which can affect the
resulting partition; the authors propose a criterion for fixing this node, but warn that
this might not yield the optimal performance, and so the method is best suited for
applications where an interactive form of clustering is required, allowing the user to
change the ground node until a satisfactory clustering is found. A recent paper by
Couprie et al.30 presents a generalized energy functional, which is demonstrated to be
equivalent, by choosing the appropriate parameter values, to several optimization-based techniques used for clustering and segmentation, such as graph cuts.
The 2006 paper by Fränti et al.54 proposes a graph-based technique, which uses
an approximate nearest neighbor graph, to speed up an agglomerative clustering
algorithm.
Dhillon et al.37 in 2007 propose a multilevel graph-based clustering algorithm that
exploits a theoretical relation between some kernels and some graph-based spectral
clustering algorithms to perform the clustering with the same properties of a spectral
method but without the computational cost of computing the eigenvectors of the
graph.
Foggia et al.52 in 2008 propose a graph-based clustering that is based on the
Minimum Spanning Tree, used in combination with the Fuzzy C-Means (FCM)
algorithm to determine automatically the clustering threshold. The method has been
further extended in Ref. 186.
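The general idea of clustering through a Minimum Spanning Tree can be sketched as follows: build the tree, remove the edges heavier than a threshold, and take the remaining connected components as clusters. In this sketch the threshold is simply passed in as a parameter, which is an assumption; the cited method determines it automatically with FCM, a step that is not reproduced here. The name `mst_clusters` is hypothetical.

```python
def mst_clusters(n, edges, threshold):
    """Cluster nodes 0..n-1; edges is a list of (weight, u, v) tuples."""
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Kruskal's algorithm: scan edges by increasing weight.
    mst = []
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            mst.append((w, u, v))

    # Rebuild the components using only the MST edges within the threshold.
    parent[:] = range(n)
    for w, u, v in mst:
        if w <= threshold:
            parent[find(u)] = find(v)

    groups = {}
    for node in range(n):
        groups.setdefault(find(node), []).append(node)
    return sorted(groups.values())


edges = [(1.0, 0, 1), (1.2, 1, 2), (5.0, 2, 3), (0.9, 3, 4), (6.0, 0, 4)]
print(mst_clusters(5, edges, threshold=2.0))
```

With the threshold set to 2.0, the single heavy tree edge between nodes 2 and 3 is cut, leaving two clusters.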
The 2008 paper by Laskaris and Zafeiriou91 introduces a graph-based clustering
algorithm that is based on FCM. Namely, FCM is used as preprocessing step, with
the task of dividing the input data into a large number of clusters (overclustering).
Then, the found clusters are used to construct a graph-based representation, the
connectivity graph, with nodes corresponding to the cluster centroids, and edges to
neighborhood relations among the clusters. The connectivity graph is used with
several graph-based algorithms to find a more accurate clustering, choosing automatically the optimal number of clusters, and for dimensionality reduction.
Zanghi et al.180 propose in their 2008 paper a graph-based clustering method based
on a probabilistic formulation of the problem, and using the Erdős–Rényi mixture
model for random graphs. The EM algorithm is used to solve the probabilistic
problem. An extension of this method is defined in Ref. 181; the new algorithm adds
the ability to use node information (in the form of node feature vectors) in addition to
edge information representing the similarity of the corresponding data points.
The 2009 paper by Kim and Choi84 presents an algorithm for graph-based clustering that uses the decomposition of the graph into r-regular subgraphs,
i.e. connected subgraphs whose nodes are adjacent to exactly r other nodes. The
decomposition is reduced to a continuous optimization problem and solved using
Linear Programming techniques. After the decomposition, a refinement step is used
to prune inconsistent edges and to remove outliers.
Wang et al.167 propose a clustering technique, called Integrated KL clustering,
that is a hybrid between a traditional clustering approach (the K-means algorithm)
and a graph-based clustering based on normalized graph cuts. The method should be
convenient in situations where the input data are partly described by a feature
vector, and partly by a set of similarity/dissimilarity relations encoded using a graph
structure.
Mimaroglu and Erdil110 in a 2011 paper propose a graph-based method for
combining the results of several clustering algorithms. The method is given as input
the results of a set of clustering algorithms applied to the same data; different
algorithms can be used, or the same algorithm with different parameters. The
method builds a graph with nodes corresponding to data points, and edges encoding
the number of clustering algorithms that have assigned two data points to the same
cluster. Then, the nodes of this graph are clustered so as to maximize the consensus
among the different clustering algorithms, using a greedy search technique. The
authors report that the final clustering obtained by the method is closer to a manual
partition of the data, and is less influenced by the choice of parameters than the
initial algorithms.
Nie et al.115 propose a graph-based clustering technique that uses a new formulation of the clustering problem, called the ℓ1-norm graph clustering, where the goal is
expressed as the minimization of the L1 norm of a suitably defined vector; this
formulation should be more robust with respect to noise and outliers.
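The graph construction step described above, with edge weights counting how many clusterings agree on a pair of points, can be sketched as follows. Only this construction is shown; the greedy consensus search of the cited method is not reproduced, and the name `co_association` is an assumption.

```python
from itertools import combinations


def co_association(labelings):
    """Count, for each pair of points, the clusterings that group them together.

    labelings is a list of label lists, one per clustering run, all of the
    same length. Pairs that are never grouped together get no edge.
    """
    n = len(labelings[0])
    weights = {}
    for labels in labelings:
        for i, j in combinations(range(n), 2):
            if labels[i] == labels[j]:
                weights[(i, j)] = weights.get((i, j), 0) + 1
    return weights


runs = [[0, 0, 1, 1], [0, 0, 0, 1], [1, 1, 0, 0]]
print(co_association(runs))
```

Note that cluster labels need not be aligned across runs, since only the equality of labels within each run is used.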
The 2012 paper by Tabatabaei et al.154 presents a graph-based clustering algorithm where the clustering goal is formulated in terms of minimizing the normalized
cut (Ncut) metric. The clustering is performed using a greedy, agglomerative algorithm, followed by a refinement procedure that evaluates the opportunity of moving
the boundary nodes of each cluster to a neighboring one.
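For reference, the normalized cut of a partition is commonly defined as the sum, over the clusters, of the weight of the edges leaving the cluster divided by the total degree (volume) of the cluster. The sketch below computes this quantity for an undirected weighted graph; the function name and the edge-dictionary layout are assumptions, and the agglomerative and refinement steps of the cited algorithm are not shown.

```python
def normalized_cut(weights, clusters):
    """Return the sum over clusters of cut(A, rest) / vol(A).

    weights maps undirected edges (u, v) to a positive weight;
    clusters is a list of disjoint node sets covering all nodes.
    """
    label = {node: idx for idx, cluster in enumerate(clusters) for node in cluster}
    cut = [0.0] * len(clusters)
    vol = [0.0] * len(clusters)
    for (u, v), w in weights.items():
        cu, cv = label[u], label[v]
        # Each edge contributes its weight to the degree of both endpoints.
        vol[cu] += w
        vol[cv] += w
        if cu != cv:
            cut[cu] += w
            cut[cv] += w
    return sum(c / v for c, v in zip(cut, vol) if v > 0)


weights = {(0, 1): 1.0, (1, 2): 1.0, (0, 2): 1.0, (2, 3): 0.5, (3, 4): 1.0}
good = normalized_cut(weights, [{0, 1, 2}, {3, 4}])
bad = normalized_cut(weights, [{0, 1}, {2, 3, 4}])
print(good, bad)
```

The partition that cuts only the weak edge between nodes 2 and 3 obtains the lower value, which is why minimizing this metric favors well-separated, balanced clusters.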
In 2012, Ducournau et al.41 propose a hierarchical algorithm for hypergraph-based
clustering, that is a generalization of graph-based clustering. The algorithm works by
performing a first-level partitioning of the nodes using a spectral technique, and then
the obtained partition is recursively refined.
Shang et al.144 in their 2012 paper propose two graph-based algorithms for the
co-clustering problem, which is aimed at finding at the same time coherent subsets of
the datapoints and coherent subsets of the features used to represent them. The
algorithms adopt iterative optimization schemes, based on the graph Laplacian.
3.4. Graph learning
Several learning methods use a graph-based structure as part of the learning process.
In some cases, the individual patterns are represented by graphs, and often the
class descriptions also have graph-based representations; in such cases, some form
of graph matching is often involved in the algorithm (see Fig. 5). In other cases, a graph
structure represents the whole input space, with nodes corresponding to individual
patterns, and edges representing some sort of proximity or similarity relation. We
will use the term learning of graphs for the first case, and graph-based learning for
the second.
3.4.1. Learning of graphs
In 2005, Neuhaus and Bunke111 present a method to learn a GED using a Self-Organizing Map. The method is given a set of graphs with class labels, and learns the
edit costs so as to ensure that graphs in the same class have a smaller GED than
graphs belonging to different classes. The same authors propose an improved algorithm for solving the same problem in a 2007 paper.113 In this latter work, they
reformulate the learning of the graph edit costs in a probabilistic framework, and use
the Expectation Maximization algorithm to optimize these costs in the Maximum
Likelihood sense.
Also the paper by Serratosa et al.143 in 2011 proposes a method for learning the
edit costs of a GED. In this case, the method is based on an Adaptive Learning

Fig. 5. An illustration of graph learning: (a) a set of objects made of three different kinds of parts (circles,
triangles, rectangles); (b) the representation in terms of graphs (node attributes are the kinds of parts,
while edges represent only the spatial relation "above" and therefore have no attributes); (c)
the corresponding learned class description, a prototype containing the common substructure: the question
mark on the prototype nodes represents a generic value (a don't care) for the corresponding attribute.

paradigm, in which the system is sequentially given new graphs and attempts to
classify them; only if the assigned class differs from the one a human expert would
have chosen is feedback given to the system, which then adapts the edit costs. A 2011
paper by Solé-Ribalta and Serratosa148 provides a further elaboration on this
method, by considering, once the edit costs are fixed, the formal properties of the
space of the possible matchings.
A similar problem is addressed by the 2012 paper by Leordeanu et al.,96 where
learning (both supervised and unsupervised) is used to obtain the parameters of a
graph matching algorithm based on spectral properties.
A 2008 paper by Maulik108 presents an algorithm for finding repeated subgraphs
within large graphs, which can be considered an unsupervised form of graph-based
learning. This can be used for data mining in domains suitable to a structural representation, e.g. web pages or molecular databases. The algorithm uses Evolutionary Programming to perform the search, with a fitness function based on the
compression of the original graph attainable by the detection of the repeated
substructure.
In their 2009 papers, Ferrer et al.49,50 propose an algorithm for computing the
median graph, that is the graph within a set of graphs that minimizes the sum of
graph distances from the other graphs. The computation of the median graph can be
considered a form of learning in the graph domain, since the median graph can be
used as a prototype for a set of graphs. The proposed method is based on genetic
algorithms, and performs a reduction of the search space by exploiting a novel
theoretical bound on the sum of distances for the particular graph distance measure
adopted (which is based on the maximum common subgraph).
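The definition of the median graph given above (the member of the set that minimizes the sum of distances to the others) translates directly into a brute-force computation. The sketch below is only meant to make the definition concrete: it is not the genetic algorithm of the cited papers, and the toy distance (size of the symmetric difference of edge sets, for graphs with unique node labels) is a hypothetical stand-in for the distance based on the maximum common subgraph.

```python
def set_median(graphs, distance):
    """Return the graph in the set with the smallest sum of distances to the others."""
    return min(graphs, key=lambda g: sum(distance(g, h) for h in graphs))


def edge_difference(g, h):
    """Toy distance (assumption): number of edges present in only one graph."""
    return len(g ^ h)


# Graphs with unique node labels, represented as frozensets of edges.
g1 = frozenset({("a", "b"), ("b", "c")})
g2 = frozenset({("a", "b"), ("b", "c"), ("c", "d")})
g3 = frozenset({("a", "b"), ("c", "d")})
median = set_median([g1, g2, g3], edge_difference)
print(sorted(median))
```

This exhaustive version needs a number of distance evaluations quadratic in the size of the set, which is one reason why bounds that prune the search space, such as the one mentioned above, are valuable.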
Ferrer et al. present a different median graph algorithm in 2010,51 which is based
on the graph embedding technique by Riesen and Bunke.123 In particular, the
method proposed by Ferrer et al. computes the median graph by converting the
graphs into vectors using the cited graph embedding, finding the median vector of
the set and then converting this vector back into a graph. In a 2011 paper,48 the same
authors present an improved procedure for performing this last step of the algorithm.
Jain and Obermayer, in their 2010 paper,75 discuss the mean and the median of a
set of graphs, using a theoretical formulation based on Riemannian orbifolds, and
present some sufficient conditions to ensure that the estimators of the mean and the
median are consistent with an underlying probability distribution of the graphs.
Also the paper by Raveaux et al.120 deals with learning a prototype for a set of
graphs. Four different kinds of prototypes are considered: median graphs, generalized
median graphs, discriminant graphs and generalized discriminant graphs. Discriminant graphs are prototypes chosen so as to maximize the performance of a Nearest
Neighbor classifier over a labeled training set. The generalized versions of median
graphs and discriminant graphs are obtained by lifting the restriction that the
prototype must be a member of the training set. All the four kinds of prototypes are
computed using a Genetic Algorithm, with different chromosome encodings and
fitness functions.
3.4.2. Graph-based learning
Culp and Michailidis,32 in their 2008 paper, propose a semi-supervised learning algorithm based on graphs. Semi-supervised learning is a form of machine learning in
which only a subset of the training data has class labels. The proposed method
assumes that the structure of the input space is described as a graph, in which nodes
are the input samples and edges encode the neighborhood relations; this graph
structure is used to assign a label to unlabeled training samples during the learning
process. A similar technique is proposed by Elmoataz et al.43 for graph-based regularization on weighted graphs.
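The general principle of using the graph structure to label the unlabeled samples can be illustrated with a generic label propagation sketch for a two-class problem: labeled nodes keep their value, and every other node repeatedly takes the average score of its neighbors. This is a textbook-style illustration under assumed names and parameters, not the specific algorithm of the cited papers.

```python
def propagate_labels(adjacency, labels, iterations=100):
    """Spread 0/1 labels over a graph by repeated neighbor averaging.

    adjacency maps each node to a list of neighbors; labels maps the
    labeled nodes to 0 or 1. Returns a score in [0, 1] for every node.
    """
    scores = {n: float(labels.get(n, 0.5)) for n in adjacency}
    for _ in range(iterations):
        updated = {}
        for node, neighbors in adjacency.items():
            if node in labels or not neighbors:
                updated[node] = scores[node]  # labeled or isolated: keep value
            else:
                updated[node] = sum(scores[nb] for nb in neighbors) / len(neighbors)
        scores = updated
    return scores


# A path of five samples; only the two endpoints carry class labels.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
scores = propagate_labels(path, {0: 0, 4: 1})
print(scores)
```

An unlabeled sample can then be assigned to class 1 when its score exceeds 0.5, so that samples closer in the graph to a labeled example inherit its class.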
Also the 2012 paper by Rohban and Rabiee132 is related to graph-based semi-supervised learning. In particular, the authors investigate the preliminary step of
graph construction: given a set of datapoints in a metric space, how a graph structure
can be constructed so that graph-based semi-supervised learning can be applied
effectively; the authors propose a supervised graph construction algorithm based on
the optimization of a smoothness functional, and show that the use of neighborhood graphs based on this method outperforms the k-NN technique commonly used
for this task.
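For comparison, the baseline k-NN graph construction mentioned above can be sketched in a few lines: each point is connected to its k nearest neighbors under the Euclidean distance, and the resulting edges are treated as undirected. The function name and the symmetrization choice are assumptions; the supervised construction proposed in the cited paper is not shown.

```python
import math


def knn_graph(points, k):
    """Return the undirected edge set linking each point to its k nearest neighbors."""
    edges = set()
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        nearest = sorted(others, key=lambda j: math.dist(p, points[j]))[:k]
        for j in nearest:
            # Store each undirected edge once, with the smaller index first.
            edges.add((min(i, j), max(i, j)))
    return edges


points = [(0.0, 0.0), (1.0, 0.0), (2.5, 0.0), (10.0, 0.0)]
print(sorted(knn_graph(points, k=1)))
```

Because the neighbor relation is not symmetric, a node can end up with more than k incident edges after symmetrization, which is one of the irregularities that alternative constructions try to control.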
The 2012 paper by Wang et al.165 introduces a novel technique to construct the
graph structure for graph-based semi-supervised learning. The authors propose the
k-Regular Nearest Neighbor graph (k-RNN) instead of the more common k-NN
graph. In the k-RNN graph, k is the average number of neighbors, and the graph is
constructed so as to minimize total weights (representing distances) of the edges. The
authors demonstrate the performance improvement of this technique in conjunction
with the Manifold-ranking semi-supervised algorithm by Zhou et al.183
Shiga and Mamitsuka146 in their 2012 paper present another graph-based, semi-supervised learning algorithm. The novel aspect of this proposal is that it integrates several graphs for representing different sources of evidence regarding
the similarity of the input patterns. The algorithm combines spectral clustering
with label propagation. Finally, Zhuang et al.185 propose a semi-supervised
learning algorithm that uses a graph-based formulation of the Non-negative
Matrix Factorization.
In a 2009 paper, Hu et al.74 propose an unsupervised algorithm for active learning
based on graphs. An active learning algorithm, when presented with a new sample,
must decide whether it can classify it according to its current knowledge, or must
have it classified by a human; in this latter case, the algorithm then uses the information provided by the human about the true class to improve the future classification performance. The algorithm proposed by Hu et al. uses a two-level graph-based clustering of the samples to perform this task.
3.5. Miscellaneous problems
Leordeanu and Hebert95 in 2005 propose a graph-based technique for finding the
correspondence between two sets of feature points.
In 2006, Bunke et al.11 define a technique for graph prediction: the aim, given a
sequence of graphs, is to predict whether or not the next graph in the sequence will
contain a certain node or edge. Two algorithms are presented, both of which learn how to
predict the occurrence of a node/edge using a training set of past sequences; one of
them is based on the frequencies of fixed-length subsequences, while the other builds
a decision tree from the training examples.
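The frequency-based idea can be illustrated with a small sketch in which the history of a single node (or edge) is encoded as a sequence of presence bits, and the prediction is the bit that most often followed the same fixed-length context in the training sequences. The encoding, the function names and the fallback value are assumptions made for illustration, not details taken from the cited paper.

```python
from collections import Counter, defaultdict


def train_predictor(sequences, n):
    """Count which presence bit follows each length-n context."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for i in range(len(seq) - n):
            counts[tuple(seq[i:i + n])][seq[i + n]] += 1
    return counts


def predict(counts, history, n, default=0):
    """Predict the next bit from the last n observations (default if context unseen)."""
    context = tuple(history[-n:])
    if context not in counts:
        return default
    return counts[context].most_common(1)[0][0]


# Presence (1) or absence (0) of one node across consecutive graphs.
training = [[1, 1, 0, 1, 1, 0, 1, 1, 0]]
model = train_predictor(training, n=2)
print(predict(model, [0, 1, 1], n=2))  # context (1, 1)
print(predict(model, [1, 1, 0], n=2))  # context (1, 0)
```

In this toy sequence the node disappears after every two consecutive appearances, so the context (1, 1) leads to a prediction of absence and (1, 0) to a prediction of presence.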
Luo et al.102 in 2006 propose a linear deformable model as a generative model for
sampling the space of graph deformations, as required in some statistical algorithms.
The model is based on a spectral graph embedding and Principal Component
Analysis.
Kokiopoulou and Saad86 present a graph-based technique for dimensionality reduction of vector data, aiming at finding a projection of a high dimensionality input
space to a low dimensionality target space so that vectors of the same class are close
in the target space, and vectors of different classes are distant. Given a labeled
training set, the method builds an affinity graph encoding the intra-class proximity relations and a repulsion graph encoding inter-class proximity relations. The projection
is then constructed using spectral properties of these two graphs.
Yu et al.178 present a graph-based dimensionality reduction technique, called
Mixture Graph Semi-Supervised Dimensionality Reduction, characterized by the use
of a Mixture Graph, that is a graph structure combining several graphs obtained
from random subspaces of the dataset. Also the 2012 paper by Tseng et al.159 presents a graph-based technique for dimensionality reduction, called Adaptive Locality
Preserving Projection, which uses a graph structure obtained from the neighborhood
information of the data points, and formulates the dimensionality reduction as a
continuous optimization problem expressed in terms of this graph. Another technique used for the representation of large dimensionality vectorial data is proposed
by Yang et al.,176 who define a graph-based algorithm that extends the Non-negative
Matrix Factorization, a technique for expressing the data as a combination of a
smaller number of vectors. A further development is proposed by Wang et al.,166 who
extend the technique to tensorial data and improve the speed of the algorithm. Cai
et al.20 present a different graph-based technique for the same problem, called Graph
Regularized Non-negative Matrix Factorization.
The 2010 paper by Kokiopoulou and Frossard85 proposes a graph-based method
to improve the output of a classifier when multiple input vectors are received at the
same time for the object to be classified, for example coming from multiple acquisition devices or from consecutive time instants. The method constructs a graph
representing the neighborhood relations in the input space, and then propagates the
class labels among adjacent nodes, using continuous optimization techniques.
A similar problem, semi-supervised classification, is addressed by Mantrach
et al.106 in 2011: given a large, partially labeled set of samples, with a graph structure
representing their similarity and neighborhood relations, the aim is to propagate the
class labels to the unlabeled nodes. This paper focuses on very large graphs (millions
of nodes), and proposes a method, based on an approximation of random walks, that
has linear time complexity with respect to the number of nodes and edges.
Lin et al.,97 in a 2010 paper, propose an algorithm, specialized for the particular
case of graphs obtained from primal sketches of 2D images, that performs at the same
time the matching between two graphs and the partition of the nodes of the graphs
into layers, which should ideally correspond to different objects in the images. The
algorithm depends heavily on the attributes of graph nodes for finding the matching;
the problem is formulated using a graph coloring framework, and is solved using a
Markov Chain Monte Carlo (MCMC) approach.
Elghazel and Hacid,42 in a 2011 paper, propose a search algorithm for graph
databases that performs an aggregated search: if the database does not contain a
single graph satisfying the query graph, the algorithm finds a minimal set of graphs
which must be combined to provide an answer to the query. The algorithm is based
on the Maximum Common Subgraph, found using the MaxCliqueDyn algorithm.87
The 2011 paper by Jain and Obermayer76 extends the definition of a Gaussian
distribution to graphs, by embedding the graphs into a Riemannian orbifold (using
the same theoretical framework of their paper Ref. 77, described in Sec. 3.3.1), using
a GED computed with the Bron–Kerbosch algorithm. The authors also define an
approximate algorithm for estimating the parameters of the distribution, and show
how it can be used to implement a naive Bayesian classifier.

4. Conclusion
The analysis of the recent literature on graph-based techniques shows that there is still
lively interest in the use of this important data structure for tackling Pattern
Recognition problems. Important achievements have been obtained so far;
among these, it is worth citing the most relevant ones.
Efficient algorithms for matching large graphs, both exactly and inexactly, with
different morphism types are now available, making the problem of graph
matching solvable, at least in those application domains where it is not necessary to adopt large
data structures. The solid theoretical framework of exact graph matching methods,
together with the extremely important property that contextual semantic information can be easily exploited, makes them practically fundamental in many
applications. In this area, the challenge is the further improvement of the methods
from the computational point of view, especially with reference to the maximum common subgraph problem. Furthermore, the definition of learning methods
able to obtain optimal prototype graphs, as a generalization of a given set of labeled
graphs, would be a great advance for this research area, as the exact generalization of
graphs is mainly done manually.
Probably the most relevant milestone in the recent history of graphs is the GED
as a tool for inexact graph matching. Over the years, the GED has assumed a central
role as a technique for applying many Statistical Pattern Recognition methods to graphs: some examples are the NN, LVQ and K-means. Apart from future extensions to
other learning and classification algorithms, two problems now appear as the main bottlenecks. First, the edit distance is computationally prohibitive (exponential)
if we are interested in optimal solutions, and its cost remains high even if we
settle for approximate ones; when the edit distance has to be evaluated many
times (as often happens), this problem becomes even more critical. Second, and probably
the most important theoretical limit, the edit distance is not embedded
into a proper continuous space: graphs are points, but between points there is
nothing.
Much effort has been devoted in recent years to graph kernels, with the aim of
using kernel machines adapted to graphs, such as SVM, PCA and MLP, and to graph
embedding as a means of using all the statistical learning and classification algorithms. In this respect, the extension of kernels to graphs seems promising, but
a theoretical guarantee about the suitability of the chosen kernel is missing; this
is a topic likely to be explored further, together with the improvement of
graph-based similarity kernels.
As regards the embedding methods, the push toward the unification of the
statistical and structural PR worlds is progressively gaining the attention of
the researchers in this field. The approach is too young for drawing definite conclusions regarding its effectiveness, but a philosophical question is unavoidable.
Graphs have been introduced for representing objects by parts and relations and
for (possibly) using contextual semantic information during their comparison. So,
when we go back, and are obliged to use them as a whole, is it still convenient to
use them?
The researchers advocating the "graphs from beginning to end" approach could
insist: "Is it really effective to solve a problem starting with graph representations, and going back to vectors, at the risk of losing important chunks of discriminative
power? If so, why not give up graphs altogether, and directly use vector-based
descriptions from the start?" The opposite faction could reply: "Why do you insist
on describing the world by graphs if there is still a lack of completely assessed
and computationally acceptable algorithms for classifying and for learning graph
prototypes?"
What should we expect in the future? On one side, we could envisage a renaissance of the exact and inexact graph matching methods, pushed by researchers
envisioning the possibility of adopting these techniques in new fields of Pattern
Recognition (bioinformatics, to cite just one), where the availability of more and
more powerful computers could, now and in the future, make it possible to solve real
problems in a finite time, exploiting the representational effectiveness of graphs. On
the other side, a consolidation of graph kernels and graph embedding, from both the theoretical and the applicative points of
view, is to be expected; this will confirm the
trend of the last decade, in which they have attracted the attention of researchers
much more than classical graph matching methodologies. In fact, graph kernelization and embedding represent for many researchers readily available (almost
off-the-shelf) tools for facing new and more complex Pattern Recognition problems
without the need to dig into the technicalities of graphs.

References
1. S. Auwatanamongkol, Inexact graph matching using a genetic algorithm for image
recognition, Pattern Recogn. Lett. 28(12) (2007) 1428–1437.
2. A. D. Bagdanov and M. Worring, First order gaussian graphs for efficient structure
classification, Pattern Recogn. 36(6) (2003) 1311–1324.
3. L. Bai and E. R. Hancock, Graph kernels from the jensen-shannon divergence, J. Math.
Imag. Vis. 47(1–2) (2013) 60–69.
4. X. Bai and L. J. Latecki, Path similarity skeleton graph matching, IEEE Trans. Pattern
Anal. Mach. Intell. 30(7) (2008) 1282–1292.
5. E. Bengoetxea, P. Larrañaga, Is. Bloch, A. Perchant and C. Boeres, Inexact graph
matching by means of estimation of distribution algorithms, Pattern Recogn. 35(12)
(2002) 2867–2880.
6. E. Bonabeau, Graph multidimensional scaling with self-organizing maps, Inform. Sci.
143 (2002) 159–180.
7. K. M. Borgwardt and H.-P. Kriegel, Shortest-path kernels on graphs, Fifth IEEE Int.
Conf. Data Mining (IEEE, 2005), pp. 74–81.
8. E. Z. Borzeshi, M. Piccardi, K. Riesen and H. Bunke, Discriminative prototype selection
methods for graph embedding, Pattern Recogn. 46(6) (2013) 1648–1657.
9. N. Bourbakis, P. Yuan and S. Makrogiannis, Object recognition using wavelets, L-G
graphs and synthesis of regions, Pattern Recogn. 40(7) (2007) 2077–2096.
10. H. Brás Silva, P. Brito and J. Pinto da Costa, A partitional clustering algorithm validated by a clustering tendency index based on graph theory, Pattern Recogn. 39(5)
(2006) 776–788.
11. H. Bunke, P. Dickinson, C. Irniger and M. Kraetzl, Recovery of missing information in
graph sequences by means of reference pattern matching and decision tree learning,
Pattern Recogn. 39(4) (2006) 573–586.
12. H. Bunke, C. Irniger and M. Neuhaus, Graph matching – challenges and potential solutions, Image Analysis and Processing – ICIAP 2005, LNCS, Vol. 3617 (Springer, 2005),
pp. 1–10.
13. H. Bunke and K. Riesen, Improving vector space embedding of graphs through feature
selection algorithms, Pattern Recogn. 44(9) (2011) 1928–1940.
14. H. Bunke and K. Riesen, Recent advances in graph-based pattern recognition with
applications in document analysis, Pattern Recogn. 44(5) (2011) 1057–1067.
15. H. Bunke and K. Riesen, Towards the unification of structural and statistical pattern
recognition, Pattern Recogn. Lett. 33(7) (2012) 811–825.
16. T. Caelli and S. Kosinov, An eigenspace projection clustering method for inexact graph
matching, IEEE Trans. Pattern Anal. Mach. Intell. 26(4) (2004) 515–519.
17. T. Caelli and T. Caetano, Graphical models for graph matching: Approximate models
and optimal algorithms, Pattern Recogn. Lett. 26(3) (2005) 339–346.
18. T. Caelli and S. Kosinov, Inexact graph matching using eigen-subspace projection
clustering, Int. J. Pattern Recogn. Artif. Intell. 18(3) (2004) 329–354.
19. T. S. Caetano, J. J. McAuley, L. Cheng, Q. V. Le and A. J. Smola, Learning graph
matching, IEEE Trans. Pattern Anal. Mach. Intell. 31(6) (2009) 1048–1058.
20. D. Cai, X. He, J. Han and T. S. Huang, Graph regularized nonnegative matrix factorization for data representation, IEEE Trans. Pattern Anal. Mach. Intell. 33(8) (2011)
1548–1560.
21. M.-C. Chang and B. B. Kimia, Measuring 3D shape similarity by graph-based matching
of the medial scaffolds, Comput. Vis. Image Understand. 115(5) (2011) 707–720.
22. F. Chevalier, J.-P. Domenger, J. Benois-Pineau and M. Delest, Retrieval of objects
in video by similarity based on graph matching, Pattern Recogn. Lett. 28(8) (2007)
939–949.
23. M. Cho, J. Lee and K. Lee, Reweighted random walks for graph matching, in Computer
Vision – ECCV 2010, eds. K. Daniilidis, P. Maragos and N. Paragios, LNCS, Vol. 6315
(2010), pp. 492–505.
24. A. S. Chowdhury, S. M. Bhandarkar, R. W. Robinson and J. C. Yu, Virtual craniofacial
reconstruction using computer vision, graph theory and geometric constraints, Pattern
Recogn. Lett. 30(10) (2009) 931–938.
25. W. J. Christmas, J. Kittler and M. Petrou, Structural matching in computer vision
using probabilistic relaxation, IEEE Trans. Pattern Anal. Mach. Intell. 17(8) (1995)
749–764.
26. D. Conte, P. Foggia, C. Sansone and M. Vento, Thirty years of graph matching in
pattern recognition, Int. J. Pattern Recogn. Artif. Intell. 18(3) (2004) 265–298.
27. D. Conte, P. Foggia, J.-M. Jolion and M. Vento, A graph-based, multi-resolution
algorithm for tracking objects in presence of occlusions, Pattern Recogn. 39(4)
(2006) 562–572.
28. D. Conte, P. Foggia, C. Sansone and M. Vento, How and why pattern recognition and
computer vision applications use graphs, in Applied Graph Theory in Computer Vision
and Pattern Recognition, eds. A. Kandel, H. Bunke and M. Last, Studies in Computational Intelligence, Vol. 52 (Springer Berlin Heidelberg, 2007), pp. 85135.
29. D. J. Cook, N. Manocha and L. B. Holder, Using a graph-based data mining system to
perform web search, Int. J. Pattern Recogn. Artif. Intell. 17(5) (2003) 705720.
30. C. Couprie, L. Grady, L. Najman and H. Talbot, Power watershed: A unifying graphbased optimization framework, IEEE Trans. Pattern Anal. Mach. Intell. 33(7) (2011)
13841399.
31. T. Cour, P. Srinivasan and J. Shi, Balanced graph matching. Adv. Neural Inform.
Process. Syst. 19 (2007) 313322.
32. M. Culp and G. Michailidis, Graph-based semisupervised learning, IEEE Trans. Pattern
Anal. Mach. Intell. 30(1) (2008) 174179.
33. W. Czech, Graph descriptors from b-matrix representation, in Graph-Based Representations in Pattern Recognition, LNCS, Vol. 6658 (Springer, 2011), pp. 1221.
34. N. Dahm, H. Bunke, T. Caelly and Y. Gao, Topological features and iterative node
elimination for speeding up subgraph isomorphism detection in 21st Int. Conf. Patt.
Recognition (2012), pp. 11641167.
35. P. Das, K. Karthik and B. C. Garai, A robust alignment-free fingerprint hashing algorithm based on minimum distance graphs, Pattern Recogn. 45(9) (2012) 3373–3388.
36. C. de Mauro, M. Diligenti, M. Gori and M. Maggini, Similarity learning for graph-based
image representations, Pattern Recogn. Lett. 24(8) (2003) 1115–1122.
37. I. S. Dhillon, Y. Guan and B. Kulis, Weighted graph cuts without eigenvectors:
A multilevel approach, IEEE Trans. Pattern Anal. Mach. Intell. 29(11) (2007)
1944–1957.
38. P. J. Dickinson, M. Kraetzl, H. Bunke, M. Neuhaus and A. Dadej, Similarity measures
for hierarchical representations of graphs with unique node labels, Int. J. Pattern
Recogn. Artif. Intell. 18(3) (2004) 425–442.
39. P. J. Dickinson, H. Bunke, A. Dadej and M. Kraetzl, Matching graphs with unique node
labels, Pattern Anal. Appl. 7 (2004) 243–254.
40. O. Duchenne, F. Bach, I. S. Kweon and J. Ponce, A tensor-based algorithm for high-order graph matching, IEEE Trans. Pattern Anal. Mach. Intell. 33(12) (2011)
2383–2395.
41. A. Ducournau, A. Bretto, S. Rital and B. Laget, A reductive approach to hypergraph
clustering: An application to image segmentation, Pattern Recogn. 45(7) (2012)
2788–2803.
42. H. Elghazel and M.-S. Hacid, Aggregated search in graph databases: Preliminary results,
in Graph-Based Representations in Pattern Recognition, LNCS, Vol. 6658 (Springer,
2011), pp. 1047–1060.
43. A. Elmoataz, O. Lezoray and S. Bougleux, Nonlocal discrete regularization on weighted
graphs: A framework for image and manifold processing, IEEE Trans. Image Process.
17(7) (2008) 1047–1060.
44. D. Emms, R. C. Wilson and E. R. Hancock, Graph matching using the interference of
continuous-time quantum walks, Pattern Recogn. 42(5) (2009) 985–1002.
45. F. Escolano, B. Bonev and M. Lozano, Information-geometric graph indexing from bags
of partial node coverages, in Graph-Based Representations in Pattern Recognition,
LNCS, Vol. 6658 (Springer, 2011), pp. 52–61.
46. S. Fankhauser, K. Riesen and H. Bunke, Speeding up graph edit distance computation through fast bipartite matching, in Graph-Based Representations in Pattern
Recognition, LNCS, Vol. 6658 (Springer, 2011), pp. 102–111.
47. S. Fankhauser, K. Riesen, H. Bunke and P. J. Dickinson, Suboptimal graph isomorphism
using bipartite matching, Int. J. Pattern Recogn. Artif. Intell. 26(6) (2012) 1–27.
48. M. Ferrer, D. Karatzas, E. Valveny, I. Bardaji and H. Bunke, A generic framework for
median graph computation based on a recursive embedding approach, Comput. Vis.
Image Understand. 115(7) (2011) 919–928.
49. M. Ferrer, E. Valveny and F. Serratosa, Median graph: A new exact algorithm using a
distance based on the maximum common subgraph, Pattern Recogn. Lett. 30(5) (2009)
579–588.
50. M. Ferrer, E. Valveny and F. Serratosa, Median graphs: A genetic approach based on
new theoretical properties, Pattern Recogn. 42(9) (2009) 2003–2012.
51. M. Ferrer, E. Valveny, F. Serratosa, K. Riesen and H. Bunke, Generalized median graph
computation by means of graph embedding in vector spaces, Pattern Recogn. 43(4)
(2010) 1642–1655.
52. P. Foggia, G. Percannella, C. Sansone and M. Vento, A graph-based algorithm for
cluster detection, Int. J. Pattern Recogn. Artif. Intell. 22(5) (2008) 843–860.
53. P. Foggia and M. Vento, Graph matching techniques for computer vision, in Graph-Based Methods in Computer Vision (IGI Global, 2012), pp. 1–41.
54. P. Fränti, O. Virmajoki and V. Hautamaki, Fast agglomerative clustering using a k-nearest
neighbor graph, IEEE Trans. Pattern Anal. Mach. Intell. 28(11) (2006) 1875–1881.
55. T. Gaertner, J. W. Lloyd and P. A. Flach, Kernels for Structured Data (Springer, 2003).
56. X. Gao, B. Xiao, D. Tao and X. Li, Image categorization: Graph edit direction histogram, Pattern Recogn. 41(10) (2008) 3179–3191.
57. X. Gao, B. Xiao, D. Tao and X. Li, A survey of graph edit distance, Pattern Anal. Appl.
13 (2010) 113–129.
58. B. Gaüzère, L. Brun and D. Villemin, Graph kernels: Crossing information from different
patterns using graph edit distance, in Structural, Syntactic, and Statistical Pattern
Recognition, Lecture Notes in Computer Science, Vol. 7626 (2012), pp. 42–50.
59. B. Gaüzère, L. Brun and D. Villemin, Two new graphs kernels in chemoinformatics,
Pattern Recogn. Lett. 33(15) (2012) 2038–2047.
60. J. Gibert, E. Valveny and H. Bunke, Dimensionality reduction for graph of words
embedding, in Graph-Based Representations in Pattern Recognition, LNCS, Vol. 6658
(Springer, 2011), pp. 22–31.
61. J. Gibert, E. Valveny and H. Bunke, Feature selection on node statistics based
embedding of graphs, Pattern Recogn. Lett. 33(15) (2012) 1980–1990.
62. J. Gibert, E. Valveny and H. Bunke, Graph embedding in vector spaces by node attribute statistics, Pattern Recogn. 45(9) (2012) 3072–3083.
63. J. Gibert, E. Valveny and H. Bunke, Embedding of graphs with discrete attributes via
label frequencies, Int. J. Pattern Recogn. Artif. Intell. 27(3) (2013) 1–26.
64. M. Gori, M. Maggini and L. Sarti, Exact and approximate graph matching using random
walks, IEEE Trans. Pattern Anal. Mach. Intell. 27(7) (2005) 1100–1111.
65. L. Grady and E. L. Schwartz, Isoperimetric graph partitioning for image segmentation,
IEEE Trans. Pattern Anal. Mach. Intell. 28(3) (2006) 469–475.
66. P.-A. Grenier, L. Brun and D. Villemin, Treelet kernel incorporating chiral information,
in GbRPR (2013), pp. 132–141.
67. L. Guigues, J.-P. Cocquerez and H. Le Men, Scale-sets image analysis, Int. J.
Comput. Vis. 68(3) (2006) 289–317.
68. L. Guigues, H. Le Men and J.-P. Cocquerez, The hierarchy of the cocoons of a graph and
its application to image segmentation, Pattern Recogn. Lett. 24(8) (2003) 1059–1066.
69. S. Günter and H. Bunke, Self-organizing map for clustering in the graph domain, Pattern
Recogn. Lett. 23(4) (2002) 405–417.
70. S. Günter and H. Bunke, Validation indices for graph clustering, Pattern Recogn. Lett.
24(8) (2003) 1107–1113.
71. E. R. Hancock and R. C. Wilson, Pattern analysis with graphs: Parallel work at Bern and
York, Pattern Recogn. Lett. 33(7) (2012) 833–841.
72. L. He, C. Y. Han, B. Everding and W. G. Wee, Graph matching for object recognition
and recovery, Pattern Recogn. 37(7) (2004) 1557–1560.
73. D. Hidović and M. Pelillo, Metrics for attributed graphs based on the maximal similarity
common subgraph, Int. J. Pattern Recogn. Artif. Intell. 18(3) (2004) 299–313.
74. W. Hu, W. Hu, N. Xie and S. Maybank, Unsupervised active learning based on hierarchical
graph-theoretic clustering, IEEE Trans. Syst. Man Cybern. B 39(5) (2009) 1147–1161.
75. B. Jain and K. Obermayer, Consistent estimator of median and mean graph, 2010 20th
Int. Conf. Pattern Recognition (ICPR) (IEEE, 2010), pp. 1032–1035.
76. B. Jain and K. Obermayer, Maximum likelihood for Gaussians on graphs, in Graph-Based
Representations in Pattern Recognition, LNCS, Vol. 6658 (Springer, 2011), pp. 62–71.
77. B. J. Jain and K. Obermayer, Graph quantization, Comput. Vis. Image Understand.
115(7) (2011) 946–961.
78. X. Jiang, K. Broelemann, S. Wachenfeld and A. Krüger, Graph-based markerless registration of city maps using geometric hashing, Comput. Vis. Image Understand. 115(7)
(2011) 1032–1043.
79. S. Jouili and S. Tabbone, Graph embedding using constant shift embedding, in Proc.
20th Int. Conf. Recognizing Patterns in Signals, Speech, Images and Videos, ICPR'10
(Springer-Verlag, Berlin, Heidelberg, 2010), pp. 82–92.
80. R. M. Cesar Jr., E. Bengoetxea, I. Bloch and P. Larrañaga, Inexact graph matching for
model-based recognition: Evaluation and comparison of optimization algorithms, Pattern Recogn. 38(11) (2005) 2099–2113.
81. D. Justice and A. Hero, A binary linear programming formulation of the graph edit
distance, IEEE Trans. Pattern Anal. Mach. Intell. 28(8) (2006) 1200–1214.
82. H. Kashima, K. Tsuda and A. Inokuchi, Marginalized kernels between labeled graphs,
Int. Conf. Machine Learning, Vol. 20 (2003), p. 321.
83. D. H. Kim, I. D. Yun and S. U. Lee, Attributed relational graph matching based on the
nested assignment structure, Pattern Recogn. 43(3) (2010) 914–928.
84. J. K. Kim and S. Choi, Clustering with r-regular graphs, Pattern Recogn. 42(9) (2009)
2020–2028.
85. E. Kokiopoulou and P. Frossard, Graph-based classification of multiple observation sets,
Pattern Recogn. 43(12) (2010) 3988–3997.
86. E. Kokiopoulou and Y. Saad, Enhanced graph-based dimensionality reduction with
repulsion Laplaceans, Pattern Recogn. 42(11) (2009) 2392–2402.
87. J. Konc and D. Janežič, An improved branch and bound algorithm for the maximum
clique problem, MATCH Communications in Mathematical and in Computer Chemistry
58(3) (2007) 569–590.
88. R. Kondor, N. Shervashidze and K. M. Borgwardt, The graphlet spectrum, in Proc. 26th
Annual Int. Conf. Machine Learning, ICML '09 (ACM, 2009), pp. 529–536.
89. A. Kostin, J. Kittler and W. Christmas, Object recognition by symmetrised graph
matching using relaxation labelling with an inhibitory mechanism, Pattern Recogn.
Lett. 26(3) (2005) 381–393.
90. J. Larrosa and G. Valiente, Constraint satisfaction algorithms for graph pattern
matching, Math. Struct. Comput. Sci. 12(4) (2002) 403–422.
91. N. A. Laskaris and S. P. Zafeiriou, Beyond FCM: Graph-theoretic post-processing algorithms for learning and representing the data structure, Pattern Recogn. 41(8) (2008)
2630–2644.
92. W.-J. Lee, V. Cheplygina, D. M. J. Tax, M. Loog and R. P. W. Duin, Bridging structure
and feature representations in graph matching, Int. J. Pattern Recogn. Artif. Intell.
26(5) (2012) 1–22.
93. W.-J. Lee and R. P. W. Duin, A labelled graph based multiple classifier system, in
Multiple Classifier Systems, eds. J. A. Benediktsson, J. Kittler and F. Roli, Lecture
Notes in Computer Science, Vol. 5519 (Springer Berlin Heidelberg, 2009), pp. 201–210.
94. W.-J. Lee, R. P. W. Duin and H. Bunke, Selecting structural base classifiers for graph-based multiple classifier systems, in Multiple Classifier Systems, eds. N. Gayar, J. Kittler
and F. Roli, Lecture Notes in Computer Science, Vol. 5997 (Springer, Berlin Heidelberg,
2010), pp. 155–164.
95. M. Leordeanu and M. Hebert, A spectral technique for correspondence problems using
pairwise constraints, 10th IEEE Int. Conf. Computer Vision, 2005. ICCV 2005, Vol. 2
(2005), pp. 1482–1489.
96. M. Leordeanu, R. Sukthankar and M. Hebert, Unsupervised learning for graph
matching, Int. J. Comput. Vis. 96(1) (2012) 28–45.
97. L. Lin, X. Liu and S.-C. Zhu, Layered graph matching with composite cluster sampling,
IEEE Trans. Pattern Anal. Mach. Intell. 32(8) (2010) 1426–1442.
98. L. Livi and A. Rizzi, The graph matching problem, Pattern Anal. Appl. 16(3) (2012)
1–31.
99. M. A. Lozano and F. Escolano, A significant improvement of softassign with diffusion
kernels, in Structural, Syntactic, and Statistical Pattern Recognition (Springer, 2004),
pp. 76–84.
100. M. A. Lozano and F. Escolano, Graph matching and clustering using kernel attributes,
Neurocomputing 113 (2013) 177–194.
101. B. Luo, R. C. Wilson and E. R. Hancock, Spectral embedding of graphs, Pattern Recogn.
36(10) (2003) 2213–2230.
102. B. Luo, R. C. Wilson and E. R. Hancock, A spectral approach to learning structural
variations in graphs, Pattern Recogn. 39(6) (2006) 1188–1198.
103. M. M. Luqman, J.-Y. Ramel, J. Lladós and T. Brouard, Fuzzy multilevel graph
embedding, Pattern Recogn. 46(2) (2013) 551–565.
104. D. Macrini, S. Dickinson, D. Fleet and K. Siddiqi, Object categorization using bone
graphs, Comput. Vis. Image Understand. 115(8) (2011) 1187–1206.
105. P. Mahe and J.-P. Vert, Graph kernels based on tree patterns for molecules, Mach.
Learn. 75(1) (2009) 3–35.
106. A. Mantrach, N. van Zeebroeck, P. Francq, M. Shimbo, H. Bersini and M. Saerens, Semi-supervised classification and betweenness computation on large, sparse, directed graphs,
Pattern Recogn. 44(6) (2011) 1212–1224.
107. A. Massaro and M. Pelillo, Matching graphs by pivoting, Pattern Recogn. Lett. 24(8)
(2003) 1099–1106.
108. U. Maulik, Hierarchical pattern discovery in graphs, IEEE Trans. Syst., Man, Cybern. C
38(6) (2008) 867–872.
109. B. T. Messmer and H. Bunke, A decision tree approach to graph and subgraph isomorphism detection, Pattern Recogn. 32(12) (1999) 1979–1998.
110. S. Mimaroglu and E. Erdil, Combining multiple clusterings using similarity graph,
Pattern Recogn. 44(3) (2011) 694–703.
111. M. Neuhaus and H. Bunke, Self-organizing maps for learning the edit costs in graph
matching, IEEE Trans. Syst., Man, Cybern. B 35(3) (2005) 503–514.
112. M. Neuhaus and H. Bunke, Edit distance-based kernel functions for structural pattern
classification, Pattern Recogn. 39(10) (2006) 1852–1863.

113. M. Neuhaus and H. Bunke, Automatic learning of cost functions for graph edit distance,
Inform. Sci. 177(1) (2007) 239–247.
114. M. Neuhaus, K. Riesen and H. Bunke, Novel kernels for error-tolerant graph classification, Spatial Vis. 22(5) (2009) 425–441.
115. F. Nie, H. Wang, H. Huang and C. Ding, Unsupervised and semi-supervised learning via
l1-norm graph, 2011 IEEE Int. Conf. Computer Vision (ICCV) (2011), pp. 2268–2273.
116. B. G. Park, K. M. Lee, S. U. Lee and J. H. Lee, Recognition of partially occluded objects
using probabilistic ARG (attributed relational graph)-based matching, Comput. Vis.
Image Understand. 90(3) (2003) 217–241.
117. E. Pekalska and R. P. W. Duin, The Dissimilarity Representation for Pattern Recognition: Foundations and Applications, Vol. 64 (World Scientific Publishing Company,
2005).
118. H. Qiu and E. R. Hancock, Graph matching and clustering using spectral partitions,
Pattern Recogn. 39(1) (2006) 22–34.
119. H. Qiu and E. R. Hancock, Graph simplification and matching using commute times,
Pattern Recogn. 40(10) (2007) 2874–2889.
120. R. Raveaux, S. Adam, P. Heroux and É. Trupin, Learning graph prototypes for shape
recognition, Comput. Vis. Image Understand. 115(7) (2011) 905–918.
121. R. Raveaux, J. C. Burie and J.-M. Ogier, A graph matching method and a graph
matching distance based on subgraph assignments, Pattern Recogn. Lett. 31(5) (2010)
394–406.
122. J. Richiardi, D. Van De Ville, K. Riesen and H. Bunke, Vector space embedding of
undirected graphs with fixed-cardinality vertex sequences for classification, 20th Int.
Conf. Pattern Recognition (2010), pp. 902–905.
123. K. Riesen and H. Bunke, Graph classification by means of Lipschitz embedding, IEEE
Trans. Syst., Man, Cybern. B 39(6) (2009) 1472–1483.
124. K. Riesen and H. Bunke, Classifier ensembles for vector space embedding of graphs,
in Multiple Classifier Systems, eds. M. Haindl, J. Kittler and F. Roli, Lecture Notes in
Computer Science, Vol. 4472 (Springer Berlin Heidelberg, 2007), pp. 220–230.
125. K. Riesen and H. Bunke, Approximate graph edit distance computation by means of
bipartite graph matching, Image Vis. Comput. 27(7) (2009) 950–959.
126. K. Riesen and H. Bunke, Reducing the dimensionality of dissimilarity space embedding
graph kernels, Eng. Appl. Artif. Intell. 22 (2009) 48–56.
127. K. Riesen, M. Neuhaus and H. Bunke, Graph embedding in vector spaces by means
of prototype selection, in Graph-Based Representations in Pattern Recognition, eds.
F. Escolano and M. Vento, Lecture Notes in Computer Science, Vol. 4538 (Springer,
2007), pp. 383–393.
128. A. Robles-Kelly and E. R. Hancock, String edit distance, random walks and graph
matching, Int. J. Pattern Recogn. Artif. Intell. 18(3) (2004) 315–327.
129. A. Robles-Kelly and E. R. Hancock, Graph edit distance from spectral seriation, IEEE
Trans. Pattern Anal. Mach. Intell. 27(3) (2005) 365–378.
130. A. Robles-Kelly and E. R. Hancock, A Riemannian approach to graph embedding,
Pattern Recogn. 40(3) (2007) 1042–1056.
131. D. Rodenas, F. Serratosa and A. Sole-Ribalta, Parallel graduated assignment algorithm
for multiple graph matching based on a common labelling, in Graph-Based Representations in Pattern Recognition, LNCS, Vol. 6658 (Springer, 2011), pp. 132–141.
132. M. Hossein Rohban and H. R. Rabiee, Supervised neighborhood graph construction for
semi-supervised classification, Pattern Recogn. 45(4) (2012) 1363–1372.
133. L. Rossi, A. Torsello and E. R. Hancock, A continuous-time quantum walk kernel
for unattributed graphs, in Graph-Based Representations in Pattern Recognition,
eds. W. G. Kropatsch, N. M. Artner, Y. Haxhimusa and X. Jiang, Lecture Notes in
Computer Science, Vol. 7877 (Springer, Berlin Heidelberg, 2013), pp. 101–110.
134. S. Rota Bulò, M. Pelillo and I. M. Bomze, Graph-based quadratic optimization: A fast
evolutionary approach, Comput. Vis. Image Understand. 115(7) (2011) 984–995.
135. A. Sanfeliu, R. Alquezar, J. Andrade, J. Climent, F. Serratosa and J. Verges, Graph-based representations and techniques for image processing and image analysis, Pattern
Recogn. 35(3) (2002) 639–650.
136. A. Sanfeliu, F. Serratosa and R. Alquezar, Second-order random graphs for modeling
sets of attributed graphs and their application to object learning and recognition, Int. J.
Pattern Recogn. Artif. Intell. 18(3) (2004) 375–396.
137. G. Sanromà, R. Alquezar and F. Serratosa, A new graph matching method for point-set
correspondence using the EM algorithm and softassign, Comput. Vis. Image Understand.
116(2) (2012) 292–304.
138. M. De Santo, P. Foggia, C. Sansone and M. Vento, A large database of graphs and its
use for benchmarking graph isomorphism algorithms, Pattern Recogn. Lett. 24(8)
(2003) 1067–1079.
139. E. R. Scheinerman and K. Tucker, Modeling graphs using dot product representations,
Comput. Stat. 25 (2010) 1–16.
140. T. B. Sebastian, P. N. Klein and B. B. Kimia, Recognition of shapes by editing their
shock graphs, IEEE Trans. Pattern Anal. Mach. Intell. 26(5) (2004) 550–571.
141. F. Serratosa, R. Alquezar and A. Sanfeliu, Synthesis of function-described graphs and
clustering of attributed graphs, Int. J. Pattern Recogn. Artif. Intell. 16(6) (2002)
621–655.
142. F. Serratosa, R. Alquezar and A. Sanfeliu, Function-described graphs for modelling
objects represented by sets of attributed graphs, Pattern Recogn. 36(3) (2003) 781–798.
143. F. Serratosa, A. Solé-Ribalta and X. Cortés, Automatic learning of edit costs based on
interactive and adaptive graph recognition, in Graph-Based Representations in Pattern
Recognition, LNCS, Vol. 6658 (Springer, 2011), pp. 152–163.
144. F. Shang, L. C. Jiao and F. Wang, Graph dual regularization non-negative matrix
factorization for co-clustering, Pattern Recogn. 45(6) (2012) 2237–2250.
145. N. Shervashidze, T. Petri, K. Mehlhorn, K. M. Borgwardt and S. V. N. Viswanathan,
Efficient graphlet kernels for large graph comparison, Int. Conf. Artificial Intelligence
and Statistics (2009), pp. 488–495.
146. M. Shiga and H. Mamitsuka, Efficient semi-supervised learning on locally informative
multiple graphs, Pattern Recogn. 45(3) (2012) 1035–1049.
147. M. Skomorowski, Syntactic recognition of distorted patterns by means of random graph
parsing, Pattern Recogn. Lett. 28(5) (2007) 572–581.
148. A. Sole-Ribalta and F. Serratosa, Exploration of the labelling space given graph edit
distance costs, in Graph-Based Representations in Pattern Recognition, LNCS, Vol.
6658 (Springer, 2011), pp. 164–174.
149. A. Sole-Ribalta and F. Serratosa, Models and algorithms for computing the common
labelling of a set of attributed graphs, Comput. Vis. Image Understand. 115(7) (2011)
929–945.
150. A. Sole-Ribalta and F. Serratosa, Graduated assignment algorithm for multiple graph
matching based on a common labeling, Int. J. Pattern Recogn. Artif. Intell. 27(1) (2013)
1–27.
151. A. Sole-Ribalta, F. Serratosa and A. Sanfeliu, On the graph edit distance cost: Properties and applications, Int. J. Pattern Recogn. Artif. Intell. 26(5) (2012) 1–21.
152. C. Solnon, AllDifferent-based filtering for subgraph isomorphism, Artif. Intell. 174
(2010) 850–864.
153. B. Strug, Using kernels on hierarchical graphs in automatic classification of designs, in
Graph-Based Representations in Pattern Recognition, LNCS, Vol. 6658 (Springer,
2011), pp. 335–344.
154. S. S. Tabatabaei, M. Coates and M. Rabbat, GANC: Greedy agglomerative normalized
cut for graph clustering, Pattern Recogn. 45(2) (2012) 831–843.
155. J. Tang, B. Jiang and B. Luo, Graph matching based on dot product representation of
graphs, in Graph-Based Representations in Pattern Recognition, LNCS, Vol. 6658
(Springer, 2011), pp. 175–184.
156. J. Tang, B. Jiang, A. Zheng and B. Luo, Graph matching based on spectral embedding
with missing value, Pattern Recogn. 45(10) (2012) 3768–3779.
157. L. Torresani, V. Kolmogorov and C. Rother, Feature correspondence via graph
matching: Models and global optimization, Computer Vision – ECCV 2008 (2008),
pp. 596–609.
158. A. Torsello and E. R. Hancock, Graph embedding using tree edit-union, Pattern Recogn.
40(5) (2007) 1393–1405.
159. C.-C. Tseng, J.-C. Chen, C.-H. Fang and J.-J. James Lien, Human action recognition based
on graph-embedded spatio-temporal subspace, Pattern Recogn. 45(10) (2012) 3611–3624.
160. J. R. Ullmann, Bit-vector algorithms for binary constraint satisfaction and subgraph
isomorphism, J. Exp. Algorithmics 15 (2011) 1.6:1.1–1.6:1.64.
161. B. J. van Wyk and M. A. van Wyk, Kronecker product graph matching, Pattern
Recogn. 36(9) (2003) 2019–2030.
162. B. J. van Wyk and M. A. van Wyk, A POCS-based graph matching algorithm, IEEE
Trans. Pattern Anal. Mach. Intell. 26(11) (2004) 1526–1530.
163. M. A. van Wyk and B. J. van Wyk, A learning-based framework for graph matching, Int.
J. Pattern Recogn. Artif. Intell. 18(3) (2004) 355–374.
164. M. A. van Wyk, T. S. Durrani and B. J. van Wyk, A RKHS interpolator-based graph
matching algorithm, IEEE Trans. Pattern Anal. Mach. Intell. 24(7) (2002) 988–995.
165. B. Wang, F. Pan, K.-M. Hu and J.-C. Paul, Manifold-ranking based retrieval using k-regular nearest neighbor graph, Pattern Recogn. 45(4) (2012) 1569–1577.
166. C. Wang, Z. Song, S. Yan, L. Zhang and H.-J. Zhang, Multiplicative nonnegative graph
embedding, IEEE Conf. Computer Vision and Pattern Recognition, 2009. CVPR 2009
(2009), pp. 389–396.
167. F. Wang, C. Ding and T. Li, Integrated KL (K-means-Laplacian) clustering: A new
clustering approach by combining attribute data and pairwise relations, in Proc. SDM,
Vol. 9 (2009), pp. 38–48.
168. J. T. L. Wang, K. Zhang, G. Chang and D. Shasha, Finding approximate patterns in
undirected acyclic graphs, Pattern Recogn. 35(2) (2002) 473–483.
169. M. Weber, M. Liwicki and A. Dengel, Indexing with well-founded total order for faster
subgraph isomorphism detection, in Graph-Based Representations in Pattern Recognition, LNCS, Vol. 6658 (Springer, 2011), pp. 185–194.
170. R. C. Wilson, E. R. Hancock and B. Luo, Pattern vectors from algebraic graph theory,
IEEE Trans. Pattern Anal. Mach. Intell. 27(7) (2005) 1112–1124.
171. R. C. Wilson and P. Zhu, A study of graph spectra for comparing graphs and trees,
Pattern Recogn. 41(9) (2008) 2833–2841.
172. B. Xiao, E. R. Hancock and R. C. Wilson, Graph characteristics from the heat kernel
trace, Pattern Recogn. 42(11) (2009) 2589–2606.
173. B. Xiao, Y.-Z. Song and P. Hall, Learning invariant structure for object identification by
using graph methods, Comput. Vis. Image Understand. 115(7) (2011) 1023–1031.
174. Y. Xiao, H. Dong, W. Wu, M. Xiong, W. Wang and B. Shi, Structure-based graph
distance measures of high degree of precision, Pattern Recogn. 41(12) (2008) 3547–3561.
175. S. Yan, D. Xu, B. Zhang, H.-J. Zhang, Q. Yang and S. Lin, Graph embedding and
extensions: A general framework for dimensionality reduction, IEEE Trans. Pattern
Anal. Mach. Intell. 29(1) (2007) 40–51.
176. J. Yang, S. Yang, Y. Fu, X. Li and T. Huang, Non-negative graph embedding, IEEE
Conf. Computer Vision and Pattern Recognition, 2008. CVPR 2008 (2008), pp. 1–8.
177. Q. You, N. Zheng, L. Gao, S. Du and Y. Wu, Analysis of solution for supervised graph
embedding, Int. J. Pattern Recogn. Artif. Intell. 22(7) (2008) 1283–1299.
178. G. Yu, H. Peng, J. Wei and Q. Ma, Mixture graph based semi-supervised dimensionality
reduction, Pattern Recogn. Image Anal. 20 (2010) 536–541.
179. S. Zampelli, Y. Deville and C. Solnon, Solving subgraph isomorphism problems with
constraint programming, Constraints 15 (2010) 327–353.
180. H. Zanghi, C. Ambroise and V. Miele, Fast online graph clustering via Erdős-Rényi
mixture, Pattern Recogn. 41(12) (2008) 3592–3599.
181. H. Zanghi, S. Volant and C. Ambroise, Clustering based on random graph model
embedding vertex features, Pattern Recogn. Lett. 31(9) (2010) 830–836.
182. M. Zaslavskiy, F. Bach and J.-P. Vert, A path following algorithm for the graph
matching problem, IEEE Trans. Pattern Anal. Mach. Intell. 31(12) (2009) 2227–2242.
183. D. Zhou, O. Bousquet, T. N. Lal, J. Weston and B. Schölkopf, Learning with local and
global consistency, in Advances in Neural Information Processing Systems 16 (MIT
Press, 2004), pp. 321–328.
184. F. Zhou and F. De la Torre, Factorized graph matching, 2012 IEEE Conf. Computer
Vision and Pattern Recognition (CVPR) (2012), pp. 127–134.
185. L. Zhuang, H. Gao, Z. Lin, Y. Ma, X. Zhang and N. Yu, Non-negative low rank and
sparse graph for semi-supervised learning, 2012 IEEE Conf. Computer Vision and
Pattern Recognition (CVPR) (2012), pp. 2328–2335.
186. D. Conte, P. Foggia, G. Percannella and M. Vento, A method based on the indirect
approach for counting people in crowded scenes, IEEE Conf. Advanced Video and
Signal Based Surveillance (AVSS) (2010), pp. 111–118.

Pasquale Foggia received his Laurea degree (cum laude) in Computer Engineering in 1995, and his Ph.D. in Electronic and Computer Engineering in 1999, from the "Federico II" University of Naples, Naples, Italy. He was an Associate Professor of Computer Science with the Department of Computer Science and Systems, University of Naples, from 2004 to 2008, and has been with the University of Salerno, Fisciano, Italy, since 2008. His current research interests include basic methodologies and applications in the fields of computer vision and pattern recognition. He is the author of several research papers on these subjects. Dr. Foggia has been a member of the International Association for Pattern Recognition (IAPR), and has been involved in the activities of the IAPR Technical Committee 15 (Graph-based Representations in Pattern Recognition) since 1997.

Gennaro Percannella received his Laurea degree (cum laude) in Electronic Engineering in 1998, and his Ph.D. in Electronic and Computer Engineering in 2002, both from the University of Salerno, Fisciano, Italy. He is currently an Assistant Professor of Computer Science and Artificial Vision at the University of Salerno, where he is a member of the Artificial Vision Research Group. He has authored more than 60 research papers in international journals and conference proceedings in the field of computer vision and pattern recognition. His current interests include the areas of pattern recognition, video and audio analysis, and machine learning in artificial vision, with applications like medical and biological image analysis, robotic vision and intelligent video surveillance. Dr. Percannella is a member of the International Association for Pattern Recognition (IAPR). He serves as a referee for many relevant journals and conferences and is on the program committees of several international conferences.

Mario Vento, IAPR Fellow, received his Ph.D. in Electronic and Computer Engineering in 1988 from the University of Naples "Federico II", Naples, Italy. Currently, he is a Full Professor of Computer Science and Artificial Intelligence at the University of Salerno (Italy), where he is also the Coordinator of the Artificial Vision Laboratory. Prof. Vento is a Fellow Scientist of the International Association for Pattern Recognition (IAPR) for his contributions to Graph based Representations in Pattern Recognition. He served as the Chairman of the IAPR Technical Committee TC15 (Graph Based Representation in Pattern Recognition) from 2002 to 2006. He has authored over 170 research papers in international journals and conference proceedings. His current research interests include artificial intelligence, image analysis, pattern recognition, machine learning, and computer vision. More specifically, his research activity covers real-time video analysis and interpretation for traffic monitoring and video surveillance applications, statistical, syntactic and structural classification techniques, exact and inexact graph matching, multiexpert classification, and learning methodologies for structural descriptions. He has been an Associate Editor of the Electronic Letters on Computer Vision and Image Analysis since 2003 and serves as a referee for many relevant journals in the field of pattern recognition and machine intelligence.
