Article info
Article history:
Available online 2 June 2011
Keywords:
2D and 3D pattern recognition
Hough Transform
SLAM
Geometric algebra
Robot vision
Abstract
This paper presents the application of 2D and 3D Hough Transforms together with conformal geometric
algebra to build 3D geometric maps using the geometric entities of lines and planes. Among several existing techniques for robot self-localization, a new approach is proposed for map matching in the Hough
domain. The geometric Hough representation is formulated in such a way that one can easily relate it
to the conformal geometric algebra framework; thus, the detected lines and planes can be used for
algebra-of-incidence computations to find geometric constraints, useful when perceiving special configurations in 3D visual space for exploration, navigation, relocation and obstacle avoidance. We believe that
this work is very useful for 2D and 3D geometric pattern recognition in robot vision tasks.
© 2011 Elsevier B.V. All rights reserved.
1. Introduction
Mobile robots are equipped with different input devices to
sense their surrounding environment. The laser range-finder is
widely used for this task, due to its precision and its wide capture
range. Our procedure merges the data obtained by the laser and
the stereo camera system to build a 3D virtual map characterized
with geometric shapes such as walls and surrounding objects that
were obtained by these sensors. This paper is an extension of previous works (Bernal-Marin and Bayro-Corrochano, 2009a,b).
Using the conformal geometric algebra (CGA), we can represent
different geometric entities like lines, planes, circles, and spheres,
including the line segments (as a pair-of-points). This framework
also allows us to formulate transformations (rotation, translation)
using spinors or versors. By using those geometric primitives, we
can represent complex 2D and 3D shapes.
There are basically two types of maps suitable for localization:
occupancy grid maps and maps based on geometric primitives.
This work uses the latter. Knowing its own pose (position and orientation) while it builds a map is a crucial issue for the robot's autonomous navigation. Markov Localization (Fox et al., 1999b) is a notable example of a method that does not rely on the extraction of specific features from the sensor data; instead, an explicit representation of the probability distribution of the robot's pose is periodically updated according to the sensor readings. Monte Carlo Localization (Fox et al., 1999a) is a more efficient version of Markov Localization, because it is based on extraction and matching of specific features from only a subset of all possible poses of the robot.

Corresponding author.
E-mail addresses: mbernal@gdl.cinvestav.mx (M. Bernal-Marin), edb@gdl.cinvestav.mx (E. Bayro-Corrochano).
0167-8655/$ - see front matter © 2011 Elsevier B.V. All rights reserved.
doi:10.1016/j.patrec.2011.05.014
Lines are often used for localization in polygonal environments. For instance, in Gutmann et al. (2001) an algorithm for computing robot poses by line matching is presented. In this work, for robot localization, we use the line characteristics in the Hough domain (Hough, 1962), (θ, ρ), to find the current robot position in the 2D map.
When the environment map has been captured and the mobile robot is started in a different place, the problem is for the robot to relocalize itself in the same map. In this work, we use the lines captured in the Hough domain to perform the relocation inside the 2D map. Grisetti et al. (2002) also present a robot localization that computes the Hough Transform and finds correspondences in a map.
When the relocation task is performed, the mobile robot uses the 3D Hough Transform to represent planes and include them in the 3D map. Our algorithm exploits the 2D and 3D Hough space to localize the robot and to find a particular 3D space configuration, which can be used to recognize spatial areas and objects and to navigate safely. We present experiments using real data which validate the efficiency of our approach.
The rest of the paper is organized as follows: in Section 2, we give an outline of geometric algebra and conformal geometric algebra. The 2D and 3D Hough Transforms are presented in Section 3. Section 4 describes the approach for 3D map building. The procedure for robot relocation is explained in Section 5. Section 6 shows a new 3D object representation. Section 7 reports experimental results, and Section 8 is devoted to the concluding remarks.
2. Geometric algebra and conformal geometric algebra

In G_{p,q,r} the geometric product of two basis vectors is defined as

e_i e_j = 1 ∈ R, for i = j ∈ {1, ..., p};
e_i e_j = −1 ∈ R, for i = j ∈ {p + 1, ..., p + q};
e_i e_j = 0 ∈ R, for i = j ∈ {p + q + 1, ..., n};
e_i e_j = e_ij = e_i ∧ e_j, for i ≠ j.

For two vectors a and b, the geometric product ab = a · b + a ∧ b has two parts: the inner product a · b is the symmetric part, while the wedge product (outer product) a ∧ b is the antisymmetric part. A multivector M ∈ G_n can be decomposed into its homogeneous grade parts, M = Σ_i ⟨M⟩_i, for 0 ≤ i ≤ n.

Table 1 lists the representations of the geometric entities in CGA, in both the IPNS (inner-product null space) and OPNS (outer-product null space) forms.

Table 1
Representation of geometric entities in CGA.

Entity | IPNS | OPNS
Point | x = x + ½x²e∞ + e0 |
Sphere | s = p − ½ρ²e∞ | s* = a ∧ b ∧ c ∧ d
Plane | P = NI_E − de∞, with N = (a − b) ∧ (a − c) and d = (a ∧ b ∧ c)I_E | P* = e∞ ∧ a ∧ b ∧ c
Line | L = P1 ∧ P2 = rI_E + e∞MI_E, with r = a − b and M = a ∧ b | L* = e∞ ∧ a ∧ b
Circle | z = s1 ∧ s2, with carrier sphere s_z = (e∞ · z)z | z* = a ∧ b ∧ c
Pair of points | PP = s1 ∧ s2 ∧ s3 | PP* = a ∧ b

Rotations and translations are represented with versors. A rotation by an angle θ about an axis with unit bivector l is carried out by the rotor

R = e^{−(θ/2)l} = cos(θ/2) − l sin(θ/2),   (3)

a translation by a vector t by the translator

T = e^{−(e∞t)/2} = 1 − (e∞t)/2,   (4)

and a versor σ acts on any entity x as x′ = σ x σ̃. The combination of both movements is the motor

M = TR.   (5)

The translator, rotor, and motor (all of them versors) are elements of G_{4,1,0}, and they define an algebra called the motor algebra. This algebra greatly simplifies the successive computation of rotations and translations: applying only the geometric product to consecutive versors gives as the final result another versor of this algebra, so all the involved transformations are compactly represented with just one versor.
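The composition property can be illustrated with the familiar matrix analogue of a motor: a rigid motion built as translation times rotation, where a chain of motions collapses into a single operator. The sketch below uses homogeneous matrices rather than CGA versors; its function names are illustrative, not from the paper.

```python
import numpy as np

def rotor(theta):
    # planar rotation as a homogeneous 3x3 matrix (matrix analogue of a rotor)
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

def translator(tx, ty):
    # translation as a homogeneous matrix (matrix analogue of a translator)
    return np.array([[1.0, 0.0, tx],
                     [0.0, 1.0, ty],
                     [0.0, 0.0, 1.0]])

# A "motor" M = T R combines both movements; successive motors compose
# by a single product, collapsing a whole chain into one operator.
M1 = translator(1.0, 0.0) @ rotor(np.pi / 2)
M2 = translator(0.0, 2.0) @ rotor(-np.pi / 2)
M = M2 @ M1                       # one operator for the whole chain

p = np.array([1.0, 0.0, 1.0])     # homogeneous point (1, 0)
p_chain = M2 @ (M1 @ p)
p_once = M @ p                    # same result with the composed operator
```

In CGA the same collapse happens by multiplying the versors themselves, with the advantage that one operator acts on any entity (point, line, plane, sphere) alike.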
Vector calculus is a coordinate-dependent mathematical system, and its cross product cannot be extended to higher dimensions. The representation of geometric primitives is based on lengthy equations, and for linear transformations one uses matrix representations with redundant coefficients. In contrast, conformal geometric algebra, a coordinate-free system, provides a fruitful description language to represent primitives and constraints in any dimension, and by using successive reflections with bivectors, one builds versors that carry out linear transformations while avoiding redundant coefficients.
3. Hough transform: overview
The Hough Transform (Hough, 1962) is an effective method to identify the location and orientation of lines (Duda and Hart, 1972). Next we describe the 2D and 3D Hough Transforms and the relation between the Hough space and conformal geometric algebra.
3.1. 2D Hough transform
The Hough Transform is the mapping of a line L of R² into a single point in the 2D Hough space (θ, ρ), where ρ ∈ R and θ ∈ [0, 2π). The equation describing a line L in R² is given by

L : y = ax + b,

where the scalars a, b are the line parameters. This line L can then be represented in the parameter space as a point with coordinates (a, b). Conversely, a point (x_i, y_i) ∈ R² can be represented in the parameter space as a line L′ as follows:

L′ : b = −x_i a + y_i,
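A minimal voting sketch of this duality (the grid bounds and resolution below are illustrative choices): each point votes along its parameter-space line, and collinear points produce a peak in the accumulator.

```python
import numpy as np

def hough_ab(points, a_vals, b_vals):
    """Vote in the (a, b) parameter space: each point (x_i, y_i)
    contributes votes along the line b = -x_i * a + y_i."""
    acc = np.zeros((len(a_vals), len(b_vals)), dtype=int)
    db = b_vals[1] - b_vals[0]
    for x, y in points:
        for ia, a in enumerate(a_vals):
            b = -x * a + y
            ib = int(round((b - b_vals[0]) / db))
            if 0 <= ib < len(b_vals):
                acc[ia, ib] += 1
    return acc

# five collinear points on y = 2x + 1 produce a peak near (a, b) = (2, 1)
pts = [(x, 2 * x + 1) for x in range(5)]
a_vals = np.linspace(-3.0, 3.0, 61)
b_vals = np.linspace(-3.0, 3.0, 61)
acc = hough_ab(pts, a_vals, b_vals)
ia, ib = np.unravel_index(acc.argmax(), acc.shape)
```

The (a, b) parameterization shown here cannot represent vertical lines, which is exactly why the normal form (θ, ρ) is used in practice.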
if 0 ≤ θ + θ_R ≤ 2π:
θ′ = θ + θ_R,
ρ′ = ρ + T_x cos(θ + θ_R) + T_y sin(θ + θ_R);

if θ + θ_R > 2π:
θ′ = θ + θ_R − 2π,
ρ′ = ρ + T_x cos(θ + θ_R) + T_y sin(θ + θ_R);

if θ + θ_R < 0:
θ′ = θ + θ_R + 2π,
ρ′ = ρ + T_x cos(θ + θ_R) + T_y sin(θ + θ_R).

It is important to notice here that if θ_R = 0 (i.e., the robot does not rotate) then θ′ = θ (θ does not change), and conversely, if T_x = T_y = 0 (i.e., the robot does not translate) then ρ′ = ρ (ρ does not change).
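A direct transcription of these update rules (the sign of the translation terms follows the equations above; the convention may flip depending on whether (T_x, T_y) is the robot's or the map's displacement):

```python
import numpy as np

def transform_hough_line(theta, rho, theta_r, tx, ty):
    """Update a line's (theta, rho) after the robot rotates by theta_r
    and translates by (tx, ty); theta is kept inside [0, 2*pi)."""
    t = theta + theta_r
    if t >= 2.0 * np.pi:
        t -= 2.0 * np.pi
    elif t < 0.0:
        t += 2.0 * np.pi
    rho_new = rho + tx * np.cos(theta + theta_r) + ty * np.sin(theta + theta_r)
    return t, rho_new

# pure rotation leaves rho untouched; pure translation leaves theta untouched
t1, r1 = transform_hough_line(0.5, 1.0, 0.3, 0.0, 0.0)
t2, r2 = transform_hough_line(0.0, 1.0, 0.0, 0.5, 0.0)
```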
3.2. 3D Hough transform
Extending the principle of the line, we can represent a plane embedded in R³ as a point (a, b, c) in the 3D parameter space:

P : z = ax + by + c.

However, a plane ax + by + c + dz = 0 cannot be represented in this parameter space if the coefficient d of the z term is equal to 0. Thus, we again use the normal form of the plane:

a₁x₁ + a₂x₂ + ⋯ + aₙxₙ = aᵀx = z,   (10)
First of all, we should underline the fact that any Hough space is just a parameterized space, which can be put in connection with conformal geometric algebra for the purpose of carrying out algebraic computations using the hyperplanes coded in the Hough space in question. That means that an nD Hough space can be used together with a G_{n+1,1} conformal geometric algebra. Our formalism was inspired by the concept of duality between hyperplanes and points, both represented in homogeneous coordinates, as Eq. (12) clearly shows, namely, their dual interdependency and geometric relationship; i.e., the point lies on the plane. Now, if we imagine in Rⁿ centered hyperspheres of varying radius (perfect set), we can consider the radius as the Hesse distance, so that any point on the surface of any hypersphere defines a unique hyperplane; see Fig. 1c. Any point lying on this hyperplane, fulfilling X ∧ P = 0, increments a counter of this hyperplane. This model is a straightforward extension of the standard 2D Hough Transform. Each counter can also store (n − 1) coordinate coefficients of each point, solely to identify at any time the point with respect to the hyperplane (1 DOF less for each point). For the 3D space that Eq. (10) shows, we can represent the plane in terms of two polar coordinates and the Hesse distance (θ, φ, ρ); see Fig. 1d. Once stable planes are found using robust algorithms like RANSAC (Tarsha-Kurdi et al., 2007), we can use the parameters of the plane (Hesse distance and plane orientation) to represent a plane in the 3D conformal geometric algebra G_{3+1,1} and look for geometric constraints among points, lines, and planes. To do so, we make use of the algebra of incidence using the meet and join operations. The discovery of geometric constraints is extremely useful to help the robot navigate in places with obstacles. Section 7 contains interesting applications of this new approach.
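A brute-force sketch of the plane voting just described, with the normal parameterized as n = (cos θ sin φ, sin θ sin φ, cos φ) and ρ = n · p (the grid sizes and this particular parameterization are illustrative assumptions):

```python
import numpy as np

def plane_hough_votes(points, thetas, phis, rhos):
    """Vote in the (theta, phi, rho) space: each 3D point votes, for every
    candidate normal direction, at the Hesse distance rho = n . p."""
    acc = np.zeros((len(thetas), len(phis), len(rhos)), dtype=int)
    dr = rhos[1] - rhos[0]
    for p in points:
        for it, th in enumerate(thetas):
            for ip, ph in enumerate(phis):
                n = np.array([np.cos(th) * np.sin(ph),
                              np.sin(th) * np.sin(ph),
                              np.cos(ph)])
                ir = int(round((n @ p - rhos[0]) / dr))
                if 0 <= ir < len(rhos):
                    acc[it, ip, ir] += 1
    return acc

# nine points on the plane z = 1: normal (0, 0, 1), Hesse distance 1
pts = [np.array([x, y, 1.0]) for x in range(3) for y in range(3)]
thetas = np.linspace(0.0, np.pi, 8, endpoint=False)
phis = np.linspace(0.0, np.pi / 2, 5)   # phi = 0 -> normal along z
rhos = np.linspace(-3.0, 3.0, 25)
acc = plane_hough_votes(pts, thetas, phis, rhos)
it, ip, ir = np.unravel_index(acc.argmax(), acc.shape)
```

In practice the peak cells would then be refined with a robust fit such as RANSAC, as the text notes, before mapping the plane into CGA.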
4. Environment map building
We use a mobile robot equipped with a laser range-finder sensor and a stereo camera system mounted on a pan-tilt head (see Fig. 2). Each of these input devices has its own coordinate system. We apply the method of hand-eye calibration (Bayro-Corrochano et al., 2000; Bernal-Marin and Bayro-Corrochano, 2009a,b) in order to get the center coordinates of each device related to a global robot coordinate system. While the mobile robot Geometer is exploring new areas and gets the data from the laser range-finder and the stereo camera system, two maps are built simultaneously: one with local coordinates L (according to the robot's current position) and another with global coordinates G.
[xᵀ, 1][aᵀ, −z]ᵀ = 0,   (11)

which in normalized form reads

[xᵀ, 1][nᵀ, −h]ᵀ = XᵀP = 0.   (12)

This represents the constraint of a homogeneous point X ∈ R^{n+1} lying on the plane P = [nᵀ, −h]ᵀ, where n ∈ Rⁿ and h ∈ R are the normal and the Hesse distance, respectively.
θ = θ_o + θ_error,   (13)

T = T_o + T_error,   (14)

where θ_o and T_o are the rotation angle and the translation vector given by the odometer, and θ_error and T_error are the correction errors generated by the comparison of the actual laser reading (line segments in the local map) and the prior reading (line segments in the global map). Using a line perpendicular to the (x, y)-plane as the rotation axis and (13) as the angle, and adding a third fixed coordinate to (14), we can apply these values in (3) to get the translator T_pos and the rotor R_pos, which represent the robot movement.
4.1. Line detection
To extract line segments from laser range points, we use a recursive line-splitting method, as shown in Zhang and Ghosh (2000), instead of the Hough Transform, because it is a fast and accurate divide-and-conquer algorithm (Nguyen et al., 2007). We map every endpoint of the line segments to CGA to get the pair-of-points (PP) entity and store it in a local map L. As the endpoints are 2D points, we set the third coordinate in R³ to 0 to fix each point in that plane. Now the local map L has every line segment represented as a pair-of-points in CGA, and we can apply any transformation (rotation, translation) to it. While the map is being built, the collected data are stored in L with regard to the initial position. Subsequent records taken from the laser range-finder replace the current local map. When a new local map L is taken, it is mapped to the global coordinate system using (19), and line matching is performed. To match line segments (namely pair-of-points), we use a property of spheres. Let s₁, s₂ ∈ G_{4,1,0}; the result of their product (15) tells us whether they intersect, are tangent, or do not intersect:
(s₁ ∧ s₂)² < 0 if s₁ and s₂ intersect;
(s₁ ∧ s₂)² = 0 if s₁ and s₂ are tangent;   (15)
(s₁ ∧ s₂)² > 0 if s₁ and s₂ do not intersect.
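For normalized IPNS spheres s_i = p_i − ½r_i²e∞ with centers at distance d, the wedge-square of (15) reduces to the purely Euclidean value ((r₁² + r₂² − d²)/2)² − r₁²r₂², so the classification can be sketched without any CGA machinery (function names here are illustrative):

```python
import numpy as np

def sphere_pair_square(c1, r1, c2, r2):
    """Euclidean value of (s1 ^ s2)^2 for two normalized IPNS spheres:
    negative -> intersect, zero -> tangent, positive -> disjoint."""
    d2 = float(np.sum((np.asarray(c1, float) - np.asarray(c2, float)) ** 2))
    return ((r1 ** 2 + r2 ** 2 - d2) / 2.0) ** 2 - (r1 ** 2) * (r2 ** 2)

def classify(c1, r1, c2, r2, eps=1e-9):
    # tolerance eps absorbs floating-point noise around exact tangency
    v = sphere_pair_square(c1, r1, c2, r2)
    if v < -eps:
        return "intersect"
    if v > eps:
        return "disjoint"
    return "tangent"
```

The identity follows from (a ∧ b)² = (a · b)² − a²b² with s_i² = r_i² and s₁ · s₂ = ½(r₁² + r₂² − d²).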
s_PP = PP / (PP ∧ e∞),   (16)

M_cl = R_pos M_lsr R̃_pos,   (17)

M_pos = T_pos R_pos,   (18)

M_lu = M_pos M_cl,   (19)
where (17) is the translation and rotation motor toward the laser's center, (18) is the movement of the robot using the laser range-finder, and (19) is the motor that leads to the source of the laser sensor in the global coordinate system. Using (19), any geometric entity (point, line, circle) recorded with the laser range-finder sensor can be moved easily to the global coordinate system with the form x′ = M_lu x M̃_lu.

As we are dealing with a 3D real world and the laser range-finder only provides a planar measurement, we can add a virtual wall (Fig. 7a) to the shapes from the laser range-finder to get a 3D visual sense of the walls inside the virtual world. If a new laser range-finder is mounted on the mobile robot, or if the laser range-finder is moved to another place on the mobile robot, it is easy to get the new motor that maps the data from the laser range-finder to the global map: one only updates the motor M_lsr, which represents the rotation and translation from the center of the global coordinate system to the laser's center, and finally recalculates (19).
4.2. Stereo camera system
The pan-tilt unit has two degrees of freedom, which can be expressed as two rotations: one for pan and the other for tilt. These rotations can be modeled using rotors as in (3). Let R_pan be the rotor for the pan movement and R_tilt the rotor for the tilt movement. Applying these rotors with the geometric product, we can model the whole pan-tilt system. We apply the method of hand-eye calibration (Bayro-Corrochano et al., 2000) to get the axes of the pan-tilt unit and to compute their intersection (or the closest point between the rotation axes), which helps us build a translation from this intersection to the source of the stereo camera system (the translator T_eye). Then we develop a motor that maps any entity taken from the stereo camera system to the global coordinate system:
T_ap = R_pos T_axis R̃_pos,   (20)
⋯   (21)
⋯   (22)
⋯   (23)
⋯   (24)
where (20) is the translation to the point with the minimum distance to the axes of the pan-tilt, taking into account the rotation of the robot position; (21) is the rotor resulting from all the rotations performed so far, both of the robot position and of the pan-tilt; (22) is the translation to the left camera of the stereo camera system, taking into account all the movements of the system; (23) is the movement motor of the robot together with the pan-tilt; and (24) is the complete movement motor of the robot. All these relations are depicted in Fig. 5.
Any point captured by the cameras, at any angle of the pan-tilt unit and at any position of the robot, can be mapped from the stereo camera system to the global coordinate system using the form

x′ = M_su x M̃_su.   (25)
Using the CGA we can capture all the entities shown in Table 1, using the OPNS form. By capturing 3D objects through their representative points, we can represent points, line segments (pair-of-points), lines, circles, planes, and spheres in the frame of the stereo camera system and then take them to the global coordinate system using (25).
According to Section 3.1, the Hough Transform is the parameterization of a line from the (x, y)-plane (a Cartesian plane) to the (θ, ρ)-plane (the Hough domain). The line segments from the map are transformed to the Hough domain, defining the transformation in the domain θ ∈ [0, 2π), so every line segment L in (x, y) corresponds to a point (θ, ρ). If only the angle θ varies, the value of ρ remains constant. So, given a previously captured map G (global map) in terms of 2D lines and a new captured map L (new local map), the difference between them is an angle Δθ and a displacement Δx and Δy, which affects the Δρ value. Fig. 6 depicts lines from the global map (a, b, c) and the local map (a′, b′, c′), and their Hough representation. Using Property 1 of the Hough Transform, the difference of an angle in the Hough domain is defined as follows:
Δθ(θ_a, θ_b) = θ_a − θ_b + 2π if case 1;
Δθ(θ_a, θ_b) = θ_a − θ_b − 2π if case 2;   (26)
Δθ(θ_a, θ_b) = θ_a − θ_b otherwise.

⋯   (27)
This step can be seen as the difference between the actual map and the previously captured map.

Build a new global map: add all the elements of D(θ, ρ) to an element l_i ∈ L_HT and store it in G′_HT,i, as shown in (28):

⋯   (28)

This gives us a displacement of the actual map, close to the global map.
Get an error of displacement ξ_i: the angle Δθ has been shifted in G′_HT,i, so we subtract G′_HT,i from G_HT. The goal is to reduce this error using

ξ_i = G_HT,i − L′_HT ≈ 0.   (29)
Populate the vote matrix V: let V be a zero vote matrix of dimension |G_HT| × |L_HT|, whose votes are cast if the error of the displacement is less than a threshold:

ξ_i < (ξ_θ, ξ_ρ).   (30)
Repeat the last three steps for each line in L_HT. Finally, when all the lines have been displaced and voted, we take the maximum value per column of V; the row position of that value corresponds to a line in G_HT. This is the line correspondence; if the value is null, there is no match.
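The voting-and-matching loop can be sketched as follows. This is a simplified variant that compares each local line against each global line directly, with per-pair thresholds instead of the shifted-map error of (28)–(30); all names and threshold values are illustrative:

```python
import numpy as np

def wrap_angle(dt):
    # keep a Hough-angle difference inside [0, 2*pi)
    if dt >= 2.0 * np.pi:
        dt -= 2.0 * np.pi
    elif dt < 0.0:
        dt += 2.0 * np.pi
    return dt

def match_lines(G, L, th_theta=0.1, th_rho=0.2):
    """G, L: lists of (theta, rho) lines from the global and local maps.
    Returns, per local line, the index of the matched global line
    (or -1 when no global line is close enough)."""
    V = np.zeros((len(G), len(L)), dtype=int)
    for j, (tl, rl) in enumerate(L):
        for i, (tg, rg) in enumerate(G):
            dt = wrap_angle(tl - tg)
            err_t = min(dt, 2.0 * np.pi - dt)   # angular distance
            err_r = abs(rl - rg)
            if err_t < th_theta and err_r < th_rho:
                V[i, j] += 1
    return [int(V[:, j].argmax()) if V[:, j].max() > 0 else -1
            for j in range(len(L))]
```

Taking the maximum per column of V mirrors the correspondence step above: a null column means the local line has no counterpart in the global map.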
Now we have the information about which lines match, and with these data we can move and rotate the robot to the right place on the map. To compute the rotation angle, we use the angle of the matching lines. Then, with it, we build a new rotor to turn the robot and the local map L (line segments) to the new position in the global map. So far, we have the orientation of the mobile robot, and it only remains to find the displacement of the position. We can find the displacement Δx and Δy using the closest point to the origin on the matching lines, and generate a translation vector t. The closest point to the origin on a line in CGA can be calculated by
t = (L · E) e∞ L I_E.   (31)
Since the map stores line segments as pair-of-points, we use

L = PP ∧ e∞   (32)

to get the line in CGA and perform (31). With this translation vector t, we make a translator and apply it to the local map and the mobile robot. Finally, the robot is located in the correct place on the map, so we can continue with navigation within the environment. Fig. 7b shows the relocation evolution, where (a) shows the robot's initial position, in which it takes a sample of the environment; (b) generates the line segments of the actual environment; (c) loads a previous map to perform matching (here we can see that the mobile robot has been displaced and turned to a random place); and (d) locates and puts the robot in the correct place on the map (here the robot has located itself in the previous environment and is placed in the right place).
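In the Euclidean (θ, ρ) picture, the CGA expression (31) corresponds to the foot of the perpendicular from the origin, (ρ cos θ, ρ sin θ); the displacement between a matched pair of lines then yields the translation vector t. A sketch of this Euclidean counterpart (illustrative, not the paper's CGA computation):

```python
import numpy as np

def closest_point_to_origin(theta, rho):
    # foot of the perpendicular from the origin to the line (theta, rho)
    return np.array([rho * np.cos(theta), rho * np.sin(theta)])

# a matched line as seen in the global map and in the already-rotated local map
p_global = closest_point_to_origin(0.3, 2.0)
p_local = closest_point_to_origin(0.3, 1.4)
t = p_global - p_local    # translation to apply to the local map and the robot
```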
Fig. 8. A mobile robot looking at a workplace, its representation in Hough space (second column), and its variant angle features (third column).
Fig. 9. A stairway representation and its Hough Transform viewed from different angles (third column: only the angle φ is affected).
Fig. 10. An indoor entry hall representation and its Hough Transform viewed from different angles (third column: only the angle φ is affected).
Which features should one consider as relevant to make an environment or object distinguishable? Depending on the context, one can take into account color, texture, Fourier or wavelet coefficients, etc. In robot vision, it is desirable to reduce the feature vector so as to classify objects in real time. For our robot navigation tasks, we have chosen only the object lines and planes as basic geometric features (see Fig. 8). Even though this yields a smaller number of features, it has proven to be very helpful for the description of complex 3D objects in office and industrial environments. By coding these structures using planes, the robot can, at another time, recognize the environment or object by applying a classifier which is fed with a feature vector built with features taken from the 3D parameter Hough space.
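One simple way to build such a feature vector, assuming the detected planes arrive as (θ, φ, ρ) triplets from the 3D Hough space (the fixed length k, the sorting criterion, and the zero-padding are illustrative assumptions, not the paper's exact recipe):

```python
import numpy as np

def plane_feature_vector(planes, k=4):
    """planes: list of (theta, phi, rho) triplets from the 3D Hough space.
    Returns a fixed-length vector for a classifier: the k planes with the
    largest |rho|, flattened, zero-padded when fewer than k are present."""
    planes = sorted(planes, key=lambda p: abs(p[2]), reverse=True)[:k]
    feat = np.zeros(3 * k)
    for i, (th, ph, rho) in enumerate(planes):
        feat[3 * i: 3 * i + 3] = (th, ph, rho)
    return feat

f = plane_feature_vector([(0.1, 0.2, 1.5), (1.0, 0.3, 0.5)], k=3)
```

Keeping the vector this small is what makes real-time classification feasible, at the cost of discarding appearance cues such as color and texture.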
Fig. 11. (a) (First column) Robot view. (b) (Second column) Stereo images and virtual world. (c) (Third column) Planes represented in R³. (d) (Fourth column) Planes in 3D parameter space.

Fig. 12. (a) (Top row) Robot perception of environments. (b) (Middle row) Planes represented in R³. (c) (Bottom row) Planes in 3D parameter space.
students' places. The robot passes through this corridor, which has many planes along its way. These planes are represented as points in the 3D Hough space (see Fig. 11d) and are processed by our algorithm. After the robot has sensed the planes, Fig. 11b depicts them in
Acknowledgments
The authors gratefully acknowledge the support of CINVESTAV
and the project CONACYT-2007-1 No. 82084, Cognitive and geometric methods for humanoid perception, learning, control and
action.
References
Bayro-Corrochano, E., 2010. Geometric Computing: For Wavelet Transforms, Robot Vision, Learning, Control and Action, first ed. Springer, ISBN 978-1-84882-928-2.
Bayro-Corrochano, E., Daniilidis, K., Sommer, G., 2000. Motor algebra for 3d kinematics: The case of the hand-eye calibration. J. Math. Imaging Vision 13 (2), 79–100.
Bernal-Marin, M., Bayro-Corrochano, E., 2009a. Machine learning and geometric technique for slam. In: Bayro-Corrochano, E., Eklundh, J.-O. (Eds.), CIARP, Lecture Notes in Computer Science, vol. 5856. Springer, pp. 851–858.
Bernal-Marin, M., Bayro-Corrochano, E., 2009b. Visual and laser guided robot relocalization using lines, hough transformation and machine learning techniques. In: IROS. IEEE, pp. 4142–4147.
Duda, R.O., Hart, P.E., 1972. Use of the hough transformation to detect lines and curves in pictures. Comm. ACM 15, 11–15.
Fox, D., Burgard, W., Dellaert, F., Thrun, S., 1999a. Monte carlo localization: Efficient position estimation for mobile robots. In: Proc. National Conference on Artificial Intelligence (AAAI-99), Orlando, Florida, USA, pp. 343–349.
Fox, D., Burgard, W., Thrun, S., 1999b. Markov localization for mobile robots in dynamic environments. J. Artif. Intell. Res. 11, 391–427.
Grisetti, G., Iocchi, L., Nardi, D., 2002. Global hough localization for mobile robots in polygonal environments. In: Proc. IEEE International Conference on Robotics and Automation, ICRA 2002, Washington, DC, USA, pp. 353–358.
Gutmann, J.-S., Weigel, T., Nebel, B., 2001. A fast, accurate and robust method for self-localization in polygonal environments using laser range finders. Adv. Robot. 14 (8), 651–667.
Hough, P., 1962. Methods and means for recognizing complex patterns. US Patent 3,069,654.
Iocchi, L., Mastrantuono, D., Nardi, D., 2001. A probabilistic approach to hough localization. In: Proc. IEEE International Conference on Robotics and Automation, ICRA 2001, Seoul, Korea. IEEE, pp. 4250–4255.
Nguyen, V., Gächter, S., Martinelli, A., Tomatis, N., Siegwart, R., 2007. A comparison of line extraction algorithms using 2d range data for indoor mobile robotics. Auton. Robots 23 (2), 97–111.
Rosten, E., Porter, R., Drummond, T., 2010. Faster and better: A machine learning approach to corner detection. IEEE Trans. Pattern Anal. Machine Intell. 32, 105–119.
Tarsha-Kurdi, F., Landes, T., Grussenmeyer, P., 2007. Hough-transform and extended ransac algorithms for automatic detection of 3d building roof planes from lidar data. In: Proc. International Society for Photogrammetry and Remote Sensing, Workshop on Laser Scanning, Houston, Texas, USA, vol. XXXVI, pp. 407–412.
Zhang, L., Ghosh, B.K., 2000. Line segment based map building and localization using 2d laser rangefinder. In: Proc. 2000 IEEE International Conference on Robotics and Automation, ICRA 2000, San Francisco, CA, USA, pp. 2538–2543.