The GIS files

v2.0 Mar 2003 © Crown copyright
Responsibility for this document
Danny Hyam, Ordnance Survey Technical Promotions Media Manager, is responsible for the
content of this document.

Change history
Version Date Summary of change
1.0 May 2002 First issue
1.1 Oct 2002 Minor change
2.0 Mar 2003 Second issue

Content
This document consists of 89 pages.

Distribution
The data file for this document is archived by Corporate Publishing as D01100.doc.

Approval for issue

Issued by Danny Hyam.

Trademarks
Ordnance Survey, the OS Symbol, OSGB36, Land-Line, Strategi, OSCAR and OSCAR
Route-Manager are registered trademarks and ADDRESS-POINT, Boundary-Line,
Get-a-map, Digital National Framework, DNF, Meridian, MiniScale, OS and OS MasterMap
are trademarks of Ordnance Survey, the national mapping agency of Great Britain.

Contents
1 Getting to grips with GIS
1.1 Introduction
1.2 In the beginning.... there were maps
1.2.1 Map types
1.2.2 Map features
1.2.3 Map information
1.3 Introducing raster and vector
1.3.1 Maps in bits
1.3.2 Vector data
1.3.3 Raster data
1.3.4 Vector v Raster
1.3.5 Raster can be intelligent
1.4 GIS = software + data.... a formula for success
1.4.1 Building blocks
1.4.2 GIS evolution
1.5 Different types of GIS data
1.5.1 Data sources
1.5.2 Ordnance Survey data and GIS
1.5.3 Linking data
1.6 The significance of scale
1.6.1 Scale basics
1.6.2 Scale of capture
1.6.3 Generalisation
1.6.4 Be careful with scale
2 Geographical data
2.1 Introduction
2.2 Data capture from maps
2.2.1 Scanning
2.2.2 Digitising
2.2.3 Vectorisation
2.3 Surveying and remote sensing
2.3.1 Early surveying techniques
2.3.2 Photogrammetry – remote sensing
2.3.3 The Global Positioning System (GPS)
2.3.4 Pen computers
2.4 Position matters
2.4.1 Georeferencing
2.4.2 Coordinate systems
2.4.3 Methods of referencing data
2.4.4 The third dimension: height
2.4.5 Global, regional and national systems
2.5 What does GIS data look like?
2.5.1 Styles
2.5.2 Changing the appearance of vector data
2.5.3 Changing the appearance of raster data
2.6 Looking at multiple layers
2.6.1 Combining layers
2.6.2 Identifying change over time

2.7 The third dimension
2.7.1 From 2-D to 3-D
2.7.2 3-D GIS data
2.7.3 Realistic models
2.8 Topology. It is all about relationships
2.8.1 It is all about relationships
2.8.2 Link-node topology
2.8.3 Polygon topology
3 Adding real-world information
3.1 Introduction
3.2 The attributes of map features
3.2.1 What are attributes?
3.3 The attributes of map features
3.3.1 Attribute tables
3.4 GIS can tell you everything worth knowing about anything
3.5 Using GIS? Be selective
3.6 Geocoding
3.7 Structured GIS data is the key
3.7.1 The significance of structure
3.7.2 Address data
3.7.3 Road network data
3.7.4 Spaghetti data
3.7.5 Polygon structured data
4 Putting it all together as a system
4.1 Introduction
4.2 Unlocking the information
4.3 GIS reveals all
4.3.1 The thematic map
4.3.2 Visual analysis
4.4 What happens where? – The power of the spatial query
4.4.1 Basic spatial querying
4.4.2 Buffers
4.4.3 Overlay operations
4.4.4 The tricky but important bit
4.5 Show me the way to go
4.5.1 Network analysis
4.5.2 In-car navigation
4.5.3 Drive-time analysis
4.5.4 Optimum-path analysis
4.6 Some simple GIS examples
4.6.1 How many useful applications can GIS provide?
4.6.2 Flood risk
4.6.3 Emergency services
4.6.4 Estate agents
4.6.5 Nature conservation
4.6.6 Retail
4.6.7 3-D environmental impact analysis
4.6.8 Airport-noise pollution
5 Case studies
6 Chapter 6 – Expert GIS concepts
6.1 Introduction
6.2 Data formats

6.2.1 GI data compared to other data
6.2.2 Proprietary file types
6.2.3 Translators and transfer formats
6.2.4 Open formats
6.3 Standards
6.3.1 Metadata
6.3.2 Gazetteers
6.3.3 Gazetteers in England and Wales
6.3.4 XML and GML
6.3.5 Future standards
6.4 Spatial databases
6.4.1 Database fundamentals
6.4.2 Relational databases
6.4.3 Object databases
6.4.4 GIS and databases
6.4.5 Advanced database technology
6.5 Derived mapping
6.5.1 Generalisation
6.5.2 Text placement
6.5.3 Automated cartography
6.5.4 Data from imagery
6.6 Web GIS
6.6.1 Simple maps in web pages
6.6.2 Internet mapping sites
6.6.3 Internet GIS software
6.6.4 Web GIS futures
6.7 Mobile GIS
6.7.1 Positioning
6.7.2 Location-based services (LBS)
6.7.3 Personal and vehicle navigation
6.7.4 LBS for the mass market
6.7.5 Telematics

1 Getting to grips with GIS
1.1 Introduction

In this, the first chapter of The GIS files, you will find out about the fundamental building
blocks of geographical information systems (GIS) and how information that has traditionally
been shown on maps can be converted into computerised form.

In the 1930s and 40s geographical analysis was conducted by overlaying different types of
maps of the same area. Since the 1950s systems have evolved to convert this mapping into
digital form and more recently to use this data for analysis and problem solving. Nowadays
GIS is everywhere; you may have some GIS software on your PC without even
knowing it is there!

1.2 In the beginning.... there were maps

1.2.1 Map types

The story of GIS begins in the world of maps. A map is a simplified visual representation of
real things from the real world.

Maps can model the world in more than one way:

A Topographic map shows the physical surface features, for example, roads, rivers,
buildings.

A Contour map shows lines which connect point locations at which a certain property has
the same value, for example, height above sea level, isobars showing air pressure.

A Choropleth map shows areas characterised by some general common feature, for
example, political maps, agricultural crop types.

1.2.2 Map features

The earliest GIS programs were developed simply to allow map information to be stored
in computerised form. This made maps easier to store, reproduce and update. The process
of capturing map information in digital form begins with the classification of all features.

Look at any map – the different shapes and symbols are used to illustrate features. There
are four main types of symbol used to depict the different feature types. In fact, all map
features can be divided into one of four different categories:

• Point (for example, a cross symbol to represent a church).
• Line (for example, a yellow line to represent a road).
• Polygon shape or area (for example, a blue area to represent a lake).
• Text (for example, the name of a building).

The map is actually a very sophisticated information source. The human eye is able to
interpret a rich amount of information from a map simply from the pictorial content. This is
enhanced by the use of textual annotations (names of objects are written on a map in such a
way that the letters do not get in the way of the geographical features themselves). GIS
works by taking all of this information and recording it in electronic form.

1.2.3 Map information

How is map information translated into digital form and read by a computer?

The GIS must be able to store information about:


• The geometry: the shape and location of the objects.
• The attributes: the descriptive information known about the objects, normally displayed
on a map through symbology and annotation.
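
As a rough illustration of these two parts, the sketch below models a single map feature in Python. The Feature class, its field names and the sample coordinates are assumptions made for this example; they are not taken from any Ordnance Survey product or GIS package.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    """One map feature: its geometry plus its descriptive attributes."""
    kind: str            # "point", "line", "polygon" or "text"
    coordinates: list    # (x, y) pairs giving the shape and location
    attributes: dict = field(default_factory=dict)


# A hypothetical church recorded as a point feature.
church = Feature(
    kind="point",
    coordinates=[(437250.0, 115480.0)],
    attributes={"name": "St Mary's Church", "type": "place of worship"},
)


def describe(feature):
    """Return a one-line summary combining geometry and attributes."""
    name = feature.attributes.get("name", "unnamed")
    count = len(feature.coordinates)
    return f"{name}: {feature.kind} with {count} coordinate pair(s)"


print(describe(church))
```

Keeping the geometry and the attributes side by side in one record is only one possible design; many systems hold them in separate tables linked by an identifier.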

There are two fundamental methods of storing this map information in digital form: raster and
vector. These are covered in section 1.3 Introducing raster and vector.

1.3 Introducing raster and vector

1.3.1 Maps in bits

The first step in converting map information into a form that can be read by a computer is to
describe the shapes and locations of features using a series of numbers. Computers store
information in sequences of binary digits (bits), which form a code for every possible number
or letter.

This fits with the way maps reference geographical locations on the earth’s surface, through
a system of coordinates. These coordinate systems can be local, national or international.
Look at an Ordnance Survey map and you will notice that, along the sides, there is a series of
numbers associated with a grid covering the whole map area.

These numbers refer to coordinates from the British National Grid. All locations and shapes
can be defined in terms of x and y coordinates from a given grid system: it is these numerical
values which are used to translate map information into digital form. This applies in both
vector and raster formats.

1.3.2 Vector data

In vector data the features are recorded one by one, with shape being defined by the
numerical values of the pairs of xy coordinates.

A point is defined by a single pair of coordinate values.

A line is defined by a sequence of coordinate pairs defining the points through which the line
is drawn.

An area is defined in a similar way, only with the first and last points joined to make a
complete enclosure.

Vector data can be thought of as a list of values.

In the example above the map represents a building as a simple red rectangle. In vector data
the position and shape of the building are captured as a series of four pairs of numerical
coordinates. To reproduce the building in a GIS the computer reads these values and draws
a line linking the coordinate positions.
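
The idea of vector data as a list of values can be sketched in a few lines of Python. The coordinates below are invented for illustration, and the helper functions close_ring and ring_area are hypothetical names, not part of any GIS product. The area calculation uses the standard shoelace formula, which is just one thing a computer can do once a shape is held as numbers.

```python
# Hypothetical coordinates, in metres, on a national grid.
point = (437250.0, 115480.0)          # a single coordinate pair
line = [(437200.0, 115400.0),         # a sequence of coordinate pairs
        (437260.0, 115420.0),
        (437310.0, 115470.0)]
building = [(437240.0, 115470.0),     # the four corners of a rectangle
            (437260.0, 115470.0),
            (437260.0, 115482.0),
            (437240.0, 115482.0)]


def close_ring(vertices):
    """Join the last point back to the first to make a complete enclosure."""
    if vertices[0] != vertices[-1]:
        return list(vertices) + [vertices[0]]
    return list(vertices)


def ring_area(vertices):
    """Area enclosed by the ring, using the shoelace formula."""
    ring = close_ring(vertices)
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


print(close_ring(building))
print(ring_area(building), "square metres")
```

The rectangle here is 20 m by 12 m, so the computed area can be checked by hand.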

The vector version can also store additional context information about these features – the
attributes – a very important aspect, which will be explained in later chapters of the GIS files.

1.3.3 Raster data

In raster data the entire area of the map is subdivided into a grid of tiny cells. A value is
stored in each of these cells to represent the nature of whatever is present at the
corresponding location on the ground. Raster data can be thought of as a matrix of values.

The major use of raster data involves storing map information as digital images, in which the
cell values relate to the pixel colours of the image. In the example above the data records the
colour of the feature which occupies that part of the map surface; the values recorded in the
cells are either white, blue or red. To reproduce the image the computer reads each of these
cell values one by one and applies them to the pixels on the screen.
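
A raster can be pictured as a small matrix in Python. In this sketch the cell codes (0 for white, 1 for blue, 2 for red) and the text symbols used to draw them are assumptions chosen for the example; a real image would store a colour value per pixel.

```python
# Each inner list is one row of cells; each number is a colour code.
# Assumed codes for this sketch: 0 = white, 1 = blue, 2 = red.
raster = [
    [0, 0, 0, 0, 0, 0],
    [0, 2, 2, 2, 0, 0],
    [0, 2, 2, 2, 0, 1],
    [0, 0, 0, 0, 1, 1],
]

SYMBOLS = {0: ".", 1: "~", 2: "#"}


def render(grid):
    """Read the cell values one by one and turn each into a screen symbol."""
    return "\n".join("".join(SYMBOLS[value] for value in row) for row in grid)


def count_cells(grid, value):
    """Count how many cells hold a given colour code."""
    return sum(row.count(value) for row in grid)


print(render(raster))
print("red cells:", count_cells(raster, 2))
```

Notice that the raster knows only that certain cells are red; unlike the vector version, nothing in the data says that those cells together form a building.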

1.3.4 Vector v Raster

Both types of data are very useful, but there are important differences – the characteristics
below are broad generalisations which do not necessarily apply in all circumstances.

Vector                               Raster
relatively low data volume           relatively high data volume
faster display                       slower display
can also store attributes            has no attribute information
less pleasing to the eye             more pleasing to the eye
does not dictate how features        inherently stores how features
should look in the GIS               should look in the GIS

Above is a simple comparison between vector and raster. What do you notice about the
images as you zoom in and out?

In this example the raster data looks nicer but, as you zoom in, the pixel structure becomes
obvious. Eventually the image looks like a piece of modern art rather than a detail of a map!
The definition of the features is dependent upon the size of the individual grid cells – the
resolution.

The vector data is more like a graph with a line drawn between points, the width staying the
same however close you zoom.

1.3.5 Raster can be intelligent

In general terms, vector data is more valuable to GIS systems because it can store a large
amount of information about the features. Raster data can store the colour values that make
up the whole image of an area and can be used in GIS as unintelligent backdrop mapping,
like the map image on the previous page, or data from other sources like aerial photographs.

However, in some more specialist applications, raster data can be used to do more than just
capture a visual image. The idea of storing a matrix of values across an area is particularly
suited to recording measurements of a continuous nature.

For example, archaeologists will often scan an archaeological site with sensors or probes to
get a grid of magnetic or electrical readings that may reveal patterns suggesting the
presence of structures under the soil (see example above).

This is just one example of a scientific use of raster data, and in fact, such images can also
be loaded into a GIS for analysis alongside map information.
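
To give a flavour of this kind of analysis, the sketch below stores a small grid of survey readings and picks out the cells that stand out from the background. The readings, their units and the threshold are all invented for illustration; a real survey would need a method suited to the instrument used.

```python
# A hypothetical grid of sensor readings (arbitrary units), one per cell.
readings = [
    [48.0, 49.0, 50.0, 49.0],
    [49.0, 61.0, 62.0, 50.0],
    [50.0, 60.0, 63.0, 49.0],
    [49.0, 50.0, 49.0, 48.0],
]


def anomalies(grid, threshold):
    """Return (row, column) positions whose reading exceeds the threshold."""
    return [
        (r, c)
        for r, row in enumerate(grid)
        for c, value in enumerate(row)
        if value > threshold
    ]


print(anomalies(readings, threshold=55.0))
```

The cluster of high readings in the middle of the grid is the kind of pattern that might prompt a closer look at that part of the site.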

The next section, GIS = software + data, looks at how people bring data together with
computer software to build useful systems.

1.4 GIS = software + data.... a formula for success

1.4.1 Building blocks

Any successful example of GIS is based on two fundamental components:


• the map data; and
• the computer software to perform calculations and analysis.

There are many different organisations producing data for use in GIS; Ordnance Survey is
just one of these. There is also a large industry in GIS software with hundreds of companies
producing thousands of products. To find out more about some of these companies visit the
Licensed Partner area.

To be a truly effective system, a GIS needs good software and good data. One without the
other will not be productive. The other vital component is people. The GIS will only provide
useful answers to problems if the user is able to ask the right questions and can interpret the
results.

A GIS can be a simple desktop software package costing a few hundred pounds, running on
a standalone PC. Alternatively a single system can involve a very large network of
workstations and servers with many different software components costing millions of
pounds.

1.4.2 GIS evolution

GIS packages have evolved from a combination of two well established types of software:
the way in which map geometry is handled is based on graphics and computer-aided-design
(CAD) technology; the way in which attribute information is handled has been developed
from conventional spreadsheet and database technology.

A professional GIS user must be able to understand the disciplines of both these types of
software, as well as appreciating geographic principles.

One of the historical drawbacks to GIS has been the high cost involved. Nowadays reduced
costs are making GIS more accessible. The Internet is also playing a big part in increasing
the extent to which GIS technology is being utilised. Many web sites use some underlying
GIS processing to present customised map images to your browser.

Section 1.5, Different types of GIS data, looks at the different types of information which can
be used.

1.5 Different types of GIS data

1.5.1 Data sources

The most common form of GIS data is based on topographic features, that is, the features
that make up the physical structure of the land surface. Topography includes the relief of an
area (the shape of its surface) and the position of both natural and man-made features.

In addition to topographical data, there are more diverse sources of information that can be
linked into a GIS.

Large amounts of data relating to both people and the environment can be viewed and
analysed in a GIS. The image above shows how a layer of environmental information can be
overlaid on a map backdrop.

Even aerial and satellite imagery can be incorporated into a GIS and
viewed along with other data for the same area, as long as the
ground extent of the image can be identified. The most powerful GIS
applications use data taken from a range of different sources.

1.5.2 Ordnance Survey data and GIS

Ordnance Survey produces many different GIS data products; the diversity of these products
is itself an indication of the many different ways in which GIS can be used. They range from
simple raster images of road atlas style mapping, to very detailed vector data extracted from
the National Topographic Database, which is Great Britain’s official archive of large-scale
mapping. This database is incredibly detailed – it shows every house, every fence and every
stream in every single part of Great Britain.

The samples below give an idea of the range of different types of product available for GIS
use:

• MiniScale™
• Boundary-Line™
• Strategi®
• 1:250 000 Scale Colour Raster
• Meridian™
• OSCAR® Route-Manager
• 1:50 000 Scale Colour Raster
• Land-Line®
• OS MasterMap™

1.5.3 Linking data

Here you can see an example of ADDRESS-POINT™ data. This is far removed from the map
model of depicting information; the illustration is meant to demonstrate that not all data in a
GIS will look like a map. At first glance the sea of dots may not appear particularly useful:
ADDRESS-POINT is not a cartographic product, but is designed to be used in conjunction
with other layers of information within a GIS application.

ADDRESS-POINT can be used to identify specific properties against a map backdrop or to
link to other sources of associated information, like voting wards. Linking information in this
way is explained more fully in later sections of The GIS files.
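
One way such a link could be made is by testing which area each address point falls inside. The sketch below uses the well-known ray-casting test to do this. The ward boundaries, addresses and coordinates are all invented, and this is not a description of how ADDRESS-POINT itself is structured or linked; it simply shows how a sea of dots gains meaning once it is combined with another layer.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if the point lies inside the polygon."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge cross the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # The x position at which the edge crosses that line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical ward boundaries and address points on a local grid.
wards = {
    "North ward": [(0, 50), (100, 50), (100, 100), (0, 100)],
    "South ward": [(0, 0), (100, 0), (100, 50), (0, 50)],
}
addresses = {
    "1 High Street": (20, 80),
    "7 Mill Lane": (60, 10),
}


def ward_for(point):
    """Return the name of the first ward containing the point, or None."""
    for name, boundary in wards.items():
        if point_in_polygon(point, boundary):
            return name
    return None


for address, location in addresses.items():
    print(address, "->", ward_for(location))
```

Points lying exactly on a shared boundary need a rule of their own, which this simple sketch does not attempt to provide.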

In the last section of this introductory chapter we look at how scale is important in GIS.

1.6 The significance of scale

1.6.1 Scale basics

When looking at a paper map, probably the most important thing to bear in mind is the map
scale. This is the relationship between the dimensions on the paper and the real distance on
the ground.

If a building is 13 m long in the real world and a map depicts this length as 13 mm, the scale
is 1:1000. Multiplying the distance on the map by the scale factor gives you the real world
dimension.
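
The arithmetic can be written out as a short Python sketch. The function names are made up for this example; the only fact used is that there are 1000 millimetres in a metre.

```python
def scale_from_lengths(map_length_mm, ground_length_m):
    """Scale factor (the N in 1:N) from a map length and a ground length."""
    return ground_length_m * 1000.0 / map_length_mm


def ground_distance_m(map_length_mm, scale_factor):
    """Real-world distance in metres for a length measured on the map."""
    return map_length_mm * scale_factor / 1000.0


# The building from the text: 13 m on the ground, 13 mm on the map.
print("scale is 1:%d" % scale_from_lengths(13, 13))
print(ground_distance_m(13, 1000), "metres")
```

The same function shows that 20 mm measured on a 1:50 000 map corresponds to 1000 m on the ground.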

In the world of GIS and computerised mapping things are more complicated. A description of
scale can lose its meaning – the scale of the image on screen can depend on the monitor
size. The image above may appear 13 mm long on some screens but not others.

1.6.2 Scale of capture

All GIS packages enable you to zoom in and out on the map data as much as you like.
Again, this means that you cannot say that the map data has a particular scale. However, all
topographic data has a scale of capture, that is, the source data was captured at a particular
scale, whether this was a paper map or an aerial photo.

It is important to understand the source scale of your data for two fundamental reasons:
• data from a particular scale should only be viewed within a certain range of magnification
for it to make sense visually; and
• combining two or more datasets together is only appropriate if they have an equivalent
scale of capture.

1.6.3 Generalisation

Very detailed mapping, showing the outline of individual objects such as walls and fences, is
known as large-scale data. The positional accuracy of features shown on this type of
mapping is very high but there is so much detail that if you zoom out the view becomes very
cluttered.

The mapping that most of us recognise has been deliberately simplified. A cartographer
creates these simple, readable maps by selecting information from a larger-scale source. Not
all the detail from the source map can be shown. For example, a road atlas that attempted to
show every building in the country would become far too cluttered, so some features are
aggregated, smoothed out or omitted altogether.

The illustration below shows how large-scale data, when viewed at a small scale (zoomed
out), appears cluttered, whereas small-scale data, when viewed at a large scale (zoomed in),
appears very sparse.

Large-scale data Small-scale data

Sometimes it may be necessary to alter a feature’s true survey position slightly to make
space for the map symbols. Furthermore, the thick red lines of an A road are shown much
wider on the map than the actual road is on the ground. This science of small-scale map
production is known as generalisation.
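
Generalisation in practice involves a good deal of cartographic judgement, but one small part of it, smoothing out a line by dropping vertices, can be sketched in code. The example below uses the Douglas-Peucker line simplification algorithm, chosen here purely as an illustration; the text does not say which methods Ordnance Survey uses, and the road coordinates and tolerance are invented.

```python
import math


def perpendicular_distance(point, start, end):
    """Distance from a point to the straight line through start and end."""
    (x, y), (x1, y1), (x2, y2) = point, start, end
    dx, dy = x2 - x1, y2 - y1
    if dx == 0 and dy == 0:
        return math.hypot(x - x1, y - y1)
    return abs(dy * (x - x1) - dx * (y - y1)) / math.hypot(dx, dy)


def simplify(line, tolerance):
    """Douglas-Peucker: drop vertices that deviate less than the tolerance."""
    if len(line) < 3:
        return list(line)
    # Find the interior vertex farthest from the chord joining the ends.
    distances = [perpendicular_distance(p, line[0], line[-1])
                 for p in line[1:-1]]
    farthest = max(range(len(distances)), key=distances.__getitem__)
    if distances[farthest] <= tolerance:
        return [line[0], line[-1]]
    # Otherwise keep that vertex and simplify the two halves separately.
    split = farthest + 1
    left = simplify(line[:split + 1], tolerance)
    right = simplify(line[split:], tolerance)
    return left[:-1] + right


# A hypothetical road: small wiggles, then one sharp bend.
road = [(0, 0), (10, 1), (20, -1), (30, 0), (40, 15), (50, 0)]
print(simplify(road, tolerance=2))
```

With a small tolerance the wiggles disappear but the sharp bend survives; with a very large tolerance only the two end points remain, which is the digital equivalent of a road atlas drawing a winding lane as a straight line.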

1.6.4 Be careful with scale

Many GIS data products are created from generalised map sources – they are very useful for
simple, quick to draw overview maps. However, you can view the data at any scale once it
has been incorporated into a GIS – as we have seen, this can lead to data no longer making
sense if you zoom in too closely. Worse still, the effects of generalisation will show up if this
data is viewed against other, larger-scale mapping.

The demonstration above shows what can go wrong. The coloured lines are generalised map
data captured from a road atlas. When you zoom in, the deviation of the simplified features
from their survey position is apparent when the large-scale data becomes visible. The
generalised data is not wrong, it is just being magnified more than was ever intended.

These principles may seem straightforward, but it is alarming how often the real benefits of
GIS are lost through using inappropriate combinations of data. For example, you may find an
accurate road layer being shown against a less accurate coastline, which can give the (false)
impression that the restaurant you are looking for is actually an underwater one!

Getting to grips with GIS has introduced some very fundamental principles of how map
information can be stored in GIS.

2 Geographical data
2.1 Introduction
In this chapter we go into more detail about the geographical data part of a GIS. We look at
the ways in which the geographical data (more correctly called geospatial data) can be
captured for GIS and then manipulated.

Geospatial data stores information about the location, shape and attributes of real objects.

So what, you might say; paper maps have been doing this for centuries. But it is the
capturing of this information in digital form that makes it much easier to store and reproduce.
It also enables the power of computers to be used in manipulating, updating and analysing
the information in many different ways.

Let us start by looking at how we capture data from maps, in section 2.2 Data capture from
maps.

2.2 Data capture from maps


With so many paper maps in existence, it is not surprising that a lot of geospatial data has
been created using them as a template. It is also possible to create geospatial data by taking
measurements direct from physical surveys. These days, most geographical information is
captured in digital form at the point of initial survey. But some data is still created by
converting paper maps into electronic form. The two most important methods are scanning
and digitising.

2.2.1 Scanning

You are probably familiar with scanning technology already: many home and office PCs come
with a desktop scanner. The scanner takes a picture of any printed image. Because the image
is captured in digital form, it can be stored on the computer and displayed on screen.
Scanning a map is a straightforward and generally fast process, but it does not capture
attribute information for features, such as the address of a building. As discussed in
Chapter 1, raster data uses up a lot of disk space, so rasterisation of maps by scanning is
not always the most efficient method. However, it is very good for storing the cartographic
style of the map.

2.2.2 Digitising

Digitising requires the use of special equipment. The source map is laid flat on a table (tablet)
and an electronic cursor is passed over the features of the map. In this way, each of the
coordinate points which make up the different shapes can be identified. By clicking the cursor
when it is held over a point, digitising captures map data in vector form.

For digitising to work, the tablet must have a magnetic field embedded in the flat surface, so
as the cursor is moved around the map, its location can be identified.

Digitising can be very time-consuming because every single point or vertex must be captured
individually. Ordnance Survey’s National Topographic Database currently contains more than
230 million features. You can imagine what a time-consuming task it was to digitise it
originally. Fortunately, the database is maintained by surveying methods that generate digital
data directly.

When a cartographer is capturing information by digitising, it is possible to attach attribute
information to features. Often, the digitising tablet has some kind of menu of feature types.
Once a particular feature is digitised the resulting data contains information about its type
and shape.

2.2.3 Vectorisation

Some very specialised computer systems are able to convert raster data to vector data by
recognising patterns in the image. For instance, such a system can guess that a sequence of
coloured pixels which seem to form a line across the image is showing a linear feature of
some kind. If the system knows the real-world position and extent of the image, it can convert
these shapes into vector information. This vectorisation from raster data can be a fast method
of capture because it can be automated, but is usually less accurate than manual digitising.

Now we will look at more direct methods of obtaining geospatial data, in section 2.3
Surveying and remote sensing.

2.3 Surveying and remote sensing


Surveying techniques have undergone a remarkable process of evolution. Modern survey
technology is extremely complex. Here we illustrate the most significant advances.

2.3.1 Early surveying techniques


In simple terms, the job of the surveyor is to measure the size,
shape and relative location of physical objects in the outside
world. Size and distance are fairly easy – you can use physical
measuring tools of stable and constant length to record these
dimensions. Some of the earliest long measurements were
made using glass rods end to end, to fix a distance between two
points on the ground. Such rudimentary methods are still in use
today.

From the earliest days of surveying, surveyors have exploited the rules of trigonometry to
deduce distances between points on the ground without actually having to measure them
directly. Once you have accurately recorded the distance between two points, you can then
identify the distance to any third point by simply measuring the angles between all three. This
process is called triangulation and was the basis for Ordnance Survey’s original creation of
detailed mapping for the whole of Britain. The theodolite was the traditional optical tool used
to survey in this way, and more recently electrical devices were developed to conduct this
kind of ground measurement.
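
The trigonometry behind triangulation can be sketched in a few lines. The example below
applies the sine rule to a measured baseline and the two angles observed at its ends. The
function name, the parameter names and the sample figures are assumptions made for
illustration, and the sketch treats the ground as a flat plane.

```python
import math


def triangulate(baseline, angle_a_deg, angle_b_deg):
    """Distances from baseline ends A and B to a third point C.

    baseline    -- measured distance between A and B
    angle_a_deg -- angle at A between the baseline and the line of sight to C
    angle_b_deg -- angle at B between the baseline and the line of sight to C
    """
    angle_a = math.radians(angle_a_deg)
    angle_b = math.radians(angle_b_deg)
    # The three angles of a plane triangle add up to 180 degrees.
    angle_c = math.pi - angle_a - angle_b
    if angle_c <= 0:
        raise ValueError("the two lines of sight never meet")
    # Sine rule: each side is proportional to the sine of the opposite angle.
    distance_from_a = baseline * math.sin(angle_b) / math.sin(angle_c)
    distance_from_b = baseline * math.sin(angle_a) / math.sin(angle_c)
    return distance_from_a, distance_from_b


# Hypothetical survey: a 1000 m baseline with angles of 90 and 45 degrees.
from_a, from_b = triangulate(1000.0, 90.0, 45.0)
```

Only the baseline is measured on the ground; every other distance follows from angle
measurements, which is why the method suited mapping a whole country.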

2.3.2 Photogrammetry – remote sensing

Photogrammetry is the science of measuring objects from photographs.
Historically, this meant using aerial photographs to capture topographic
information. The first photogrammetric surveys were conducted more
than 100 years ago.

Now satellite pictures are also used to record the location and
geometry of features on the ground. Remote sensing is another term
describing the use of aerial and space imagery to record geographical
information. It includes the interpretation of other phenomena such as
vegetation type or land use shown in the Earth’s reflectiveness to
different wavelengths of electromagnetic radiation.

Initially, maps were created from aerial photographs by various kinds of tracing mechanism.
Sophisticated devices were engineered to allow an operator to view and trace a pointer
around the visible features on the photograph. Using a system of wheels and pulleys, this
motion was mechanically reproduced by a drawing arm. Such machines used stereoscopic
viewing to survey in 3-D (three dimensions). As technology has advanced, the techniques of
digitising and scanning have become important aspects of photogrammetry.
Ordnance Survey has captured a significant amount of its detailed mapping by digitising from
aerial photographs. Remote sensing by satellite is now widely used for data capture and, as
the accuracy increases, this method could replace ground survey and aerial photos.

2.3.3 The Global Positioning System (GPS)

The development of the GPS, by the United States Department of Defense, is revolutionising
the world of surveying. It enables positioning of objects on or above the earth’s surface in an
absolute sense, not just in relation to other nearby features (as in the use of photogrammetry
described previously, in which locations are defined relative to the known position of certain
features in the image).

GPS can be used almost anywhere in the world, 24 hours a day, in all weathers. A
constellation of 24 satellites orbits the earth and sends signals that can be picked up by GPS
receivers. GPS measurements are taken by computing the distance between the receiver
and each satellite. If a receiver picks up signals from four or more satellites, a 3-dimensional
position can be calculated. Certain methods can be used to increase the accuracy of the
position to the 1 cm level, either in real time or afterwards during post-processing.
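
The idea of fixing a position from measured distances can be illustrated with a much
simplified sketch. The example below works on a flat 2-D plane with three transmitters at
known positions, and it ignores the receiver clock error that a real GPS receiver also has to
solve for. It is not how a GPS receiver is actually programmed; the function name and the
sample coordinates are assumptions made for illustration.

```python
import math


def trilaterate(transmitters, distances):
    """Return the (x, y) position that matches three measured distances."""
    (x1, y1), (x2, y2), (x3, y3) = transmitters
    r1, r2, r3 = distances
    # Subtracting the first circle equation from the other two removes the
    # squared unknowns and leaves two linear equations:
    #   a*x + b*y = c   and   d*x + e*y = f
    a, b = 2 * (x2 - x1), 2 * (y2 - y1)
    c = r1**2 - r2**2 + x2**2 - x1**2 + y2**2 - y1**2
    d, e = 2 * (x3 - x1), 2 * (y3 - y1)
    f = r1**2 - r3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a * e - b * d
    if det == 0:
        raise ValueError("transmitters lie on a straight line")
    # Solve the pair of equations with Cramer's rule.
    return (c * e - b * f) / det, (a * f - c * d) / det


# Hypothetical transmitters, and the distances a receiver at (3, 4) would measure.
transmitters = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
ranges = [math.dist(t, (3.0, 4.0)) for t in transmitters]
position = trilaterate(transmitters, ranges)
```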

GPS measurements are obtained in the GPS coordinate system: World Geodetic System
1984 (WGS84). Users should be aware that this position usually needs to be converted into
the local coordinate system for the region, OSGB36® in Great Britain, enabling GPS to be
used alongside the local mapping. GIS data collectors can make use of the free GPS service
provided to locate map objects and features directly in the field. Full details of this are
available on Ordnance Survey’s GPS web site (http://www.gps.gov.uk).

Ordnance Survey uses GPS to locate map detail. However, there
are many other uses for GPS, such as navigating boats, planes
or cars, monitoring the stability of structures, and providing
location information for recreational users.

2.3.4 Pen computers

Whatever the method used for measuring the shape and location of objects, modern surveyors
rarely record information by hand drawing detail on a master survey document (see surveyor
on left of picture with large board). Instead, they use hand-held pen computers equipped with
flat, touch-sensitive screens (see surveyor on right of picture). These computers allow the
surveyor to draw and click directly onto the screen to update map information while out in the
field. Importantly, this means that new map features can be input directly as digital data.

Ordnance Survey surveyors use one such system, known as PRISM, standing for Portable
Revision and Integrated Survey Module. It enables the coordinates of new objects to be
added in reference to the existing features. Features that have been demolished can be
deleted while text names can be added using a freehand character recognition facility. So
even if the geographical objects have been measured with the most rudimentary and
time-honoured techniques, a tape measure for instance, the information will still be recorded
in electronic form out in the field.

Now we have looked at obtaining geospatial data, we need to look at how we relate it to the
world or to other datasets in section 2.4 Position matters.

2.4 Position matters

2.4.1 Georeferencing

Is your data georeferenced?

It is important to understand this concept because when you use a GIS, you are often
combining different layers of spatial data. An overriding coordinate system is needed so that
spatial data layers can be referenced to the earth’s surface in the same way. Otherwise, if
you use different coordinate systems, there will be no way to analyse the relationship
between the data.

In the image below, two layers of data have both been georeferenced to the same coordinate
system and hence match together.

What is georeferenced data?

Georeferenced data is spatial data that is referenced to a location on the earth’s surface. To
do this, common frames of reference and coordinate systems have been set up. This is a
tricky business and this section will explain why.

2.4.2 Coordinate systems

It is clear that points and features on, above and below the earth’s surface have position. To
express that position, you need to choose an appropriate coordinate system so that positions
can be defined within it. Such a system must obviously give each point a different coordinate.

There are thousands of coordinate systems in use throughout the world. It is possible for a
user to invent their own and, indeed, this is often done for large engineering works. The
disadvantage is that points outside the area of the bespoke system cannot be coordinated
within it and the relationship between the bespoke and other systems is hard to define.

2.4.3 Methods of referencing data

There are two fundamental methods of referencing spatial data:

The first, known as the geocentric system, uses a 3-D coordinate system with the centre of
the earth acting as the origin of the three axes. This method is universally used in scientific
applications, but it is user unfriendly when applied to points on the earth’s surface. This is
because the axes are made parallel to the spin axis of the earth rather than north (or some
other arbitrary direction). The system can be expressed in two ways: either as a
3-D Cartesian coordinate of the form x,y,z; or as latitude (φ), longitude (λ) and height (H)
above a known reference surface – the ellipsoid (see section 2.4.4 The third dimension:
height). It is important to note that the x and y do not refer to east and west or north and
south.
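
The two ways of expressing a geocentric position are linked by a standard formula. The
sketch below converts latitude, longitude and ellipsoidal height into x, y, z. It assumes the
WGS84 ellipsoid constants, which are not quoted in this document; a different ellipsoid would
need its own values. The function and constant names are chosen for illustration only.

```python
import math

# Assumed WGS84 ellipsoid parameters (not given in this document).
WGS84_SEMI_MAJOR_AXIS = 6378137.0          # metres
WGS84_FLATTENING = 1 / 298.257223563


def geodetic_to_cartesian(lat_deg, lon_deg, height_m,
                          a=WGS84_SEMI_MAJOR_AXIS, f=WGS84_FLATTENING):
    """Convert latitude, longitude and ellipsoidal height to geocentric x, y, z."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    e2 = f * (2 - f)                       # eccentricity squared
    # Radius of curvature of the ellipsoid in the east-west direction.
    n = a / math.sqrt(1 - e2 * math.sin(lat) ** 2)
    x = (n + height_m) * math.cos(lat) * math.cos(lon)
    y = (n + height_m) * math.cos(lat) * math.sin(lon)
    z = (n * (1 - e2) + height_m) * math.sin(lat)
    return x, y, z


# A point on the equator at longitude 0 lies on the x axis.
equator_point = geodetic_to_cartesian(0.0, 0.0, 0.0)
# A point at the north pole lies on the z axis.
pole_point = geodetic_to_cartesian(90.0, 0.0, 0.0)
```

Neither result resembles an easting or a northing, which shows why these coordinates are
awkward for everyday use on the earth’s surface.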

The second, more common, system is the projection. This takes the 3-D coordinates and
expresses them as a position on a plane plus a height above it. In other words, it flattens out
the curved earth in a small region to a flat surface. Coordinates on the plane tell us where a
particular point lies within the projection being used and are measured as distances east and
north from the starting point or origin.

To avoid errors, the extent of the projection is usually limited to a small part of the earth’s
surface. Choice of projection is dependent on the area of the world being considered.

Different projections have different properties. Spatial data users choose the projection that
provides the least distortion, in distance, direction, scale, area and so on, within the region
being considered.

Bear in mind that there are dozens of different types of projection and each can have
hundreds of different definitions depending on where they are used. In Great Britain the
chosen projection is known as Ordnance Survey National Grid.

2.4.4 The third dimension: height

Height can be expressed in two ways:

The first, and by far the most obvious, is to measure heights from sea level. However, this
can be difficult as the sea level is irregular and constantly changing, making measurements
of height inland both complex and expensive.

The second was created specifically because of these drawbacks. Scientists invented a more
regular surface called an ellipsoid, which approximates to sea level. However, because the
shape of sea level is complex, hundreds of different ellipsoids are required depending on the
area of earth being modelled. Within Great Britain the ellipsoid of choice is known as Airy 1830
and this is used for the National Grid projection.

2.4.5 Global, regional and national systems

Many national mapping agencies have defined local referencing systems to meet their
needs. In Britain, for example, we have the National Grid. These are perfectly acceptable
when working within a country, but problems exist in drawing up multinational data unless
there is cross-border convergence. In Europe local coordinate projections are often referred
to the European Terrestrial Reference System 1989 (ETRS89) because it is fixed at a point
in time and well defined.

Probably the most common global coordinate system in use today is the GPS-based World
Geodetic System 1984 (WGS84). This is fixed to points on the earth’s surface which move
over time because of changes in the earth’s crust.

A significant global projection system is Universal Transverse Mercator (UTM). This is a
defined set of projections that cover the whole world and allow countries to share spatial data
more easily.

Once the digital data has been georeferenced, you can display the information in an infinite
number of ways. See section 2.5 What does GIS data look like?

2.5 What does GIS data look like?

2.5.1 Styles

The very simplest advantage that GIS gives in comparison
with paper maps is that you can change the appearance of the
information to any style you like. In conventional mapping a
large amount of time and effort is spent on deciding
appropriate colours and styles for the depiction of features, to
ensure that the image is as clear and informative as possible.

The flexibility of GIS adds an extra dimension to this process: you can change the
appearance depending on exactly what message you want to convey.

2.5.2 Changing the appearance of vector data

The greatest flexibility comes when using vector data; remember that all the computer stores
is a set of coordinates which make up the shape of the object. Any GIS will let you choose
the colour and style of how the features are represented on screen. These styles will then be
reflected in the printed output from the system and, ironically, many organisations use GIS
simply to create customised printed maps.

With point features you can change the symbol type and the colour, and for line features you
can change the style and colour. For area features you can change the colour of the shape
itself as well as its perimeter. The colouring of the body of the shape (the fill) can be made
solid, patterned or even transparent.

2.5.3 Changing the appearance of raster data

As explained in section 1.3.1 Maps in bits, the nature of raster data inherently defines how the
mapping should appear. The raster map is an image of coloured pixels, and the fact that a
road is depicted comes from a visual interpretation of adjoining pixels of the same colour, not
from any information saying this is a road in the data structure itself. The only data entities
are the pixels themselves, and the only intelligence stored about those pixels is their colour.

Having said that, it is still possible to alter the appearance of raster data in a GIS. In simple
terms you could make all red pixels appear blue, for instance. Usually this is not a good idea,
because the image was designed with the most appropriate colours in the first place.

However, in some circumstances this facility is useful because you can tone down the colour
scheme of the raster image to allow other information to be placed on top and made more
readable. The graphic below demonstrates some different ways in which the appearance of a
raster image can be adjusted.
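
Both kinds of adjustment can be sketched on a tiny raster. In the example below the image is
simply a list of rows of (red, green, blue) pixel values; real GIS software will use its own
image formats and tools. The function names and the sample image are assumptions made for
illustration.

```python
RED = (255, 0, 0)
BLUE = (0, 0, 255)
WHITE = (255, 255, 255)


def recolour(image, old, new):
    """Return a copy of the image with every pixel of colour old changed to new."""
    return [[new if pixel == old else pixel for pixel in row] for row in image]


def tone_down(image, strength):
    """Return a copy of the image with every pixel blended towards white.

    strength -- 0.0 leaves the image unchanged, 1.0 turns it completely white
    """
    def fade(channel):
        return round(channel + (255 - channel) * strength)

    return [[tuple(fade(c) for c in pixel) for pixel in row] for row in image]


# Hypothetical 2 x 2 raster: two red road pixels on a white background.
image = [[RED, WHITE],
         [WHITE, RED]]
blue_roads = recolour(image, RED, BLUE)
backdrop = tone_down(image, 0.6)
```

Note that the code only ever deals with pixel colours. It has no way of knowing that the red
pixels represent a road, which is the point made above about raster data.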

The point made above – the fact that you can place a layer of text information on top of a
separate layer of raster data – leads us into the next section. In section 2.6 Looking at
multiple layers, we will look at how different data layers can work together.

2.6 Looking at multiple layers

2.6.1 Combining layers

GIS really gets going as a powerful tool when you start to work with different layers of
information about the same geographical area at the same time. When compiling a
conventional map, the cartographer has to strike a balance between displaying as much
information as possible to make the map useful and adding so much detail that it becomes
cluttered and confusing. With GIS, this problem is removed – many different layers of
information can be added, and shown in different combinations and in a different order,
depending on the particular message to be conveyed. Using the power and flexibility of the
computer, different data layers can be switched on and off at the click of a mouse, so that
many different views can be created for the same location.

The image below represents the way in which a GIS can display many different layers of
information at the same time. Using different combinations, the display can serve a wide
range of purposes that could only otherwise be achieved by producing a whole set of
different paper maps.

Referring to the graphic: step 1 creates a communication network map by switching on just
the towns, roads and railways; step 2 generates a view of the relief of the area by switching
on the contours and rivers; and step 3 gives a view of all these layers together which can
help to analyse the spread of urbanisation in relation to networks and relief.

1) Communication network map 2) Relief map 3) Urban growth

Vector data can also be shown in combination with raster data, the latter usually in the form
of a backdrop.

2.6.2 Identifying change over time

Another benefit of mixing and matching different layers is that by combining mapping for the
same area surveyed at different times, you can identify any changes. The example below
shows the changes that have occurred as the coast has been significantly eroded over time.
The green line is the current position of the cliff.

These are just simple examples of how a different message can be portrayed within a GIS by
showing a mixture of feature layers. The most sophisticated GIS users are likely to be
working with hundreds of layers, enabling them to create any kind of map display for a
particular geographical location.

Now things start to get more exciting – GIS not only revolutionises the usefulness of map
information, by allowing it to be shown in many different combinations, it also takes us
beyond the realm of the flat, planar view of the landscape. The next section looks at 3-D
mapping using GIS: section 2.7 The third dimension.

2.7 The third dimension

2.7.1 From 2-D to 3-D

To understand a two-dimensional (2-D) representation of the real landscape you need a level
of interpretation and imagination. The physical world exists in three dimensions and yet, if
you ignore those extruded plastic maps of the world with snow-capped lumps showing the
main mountain ranges, the realm of conventional maps is uncompromisingly flat. The
capability of GIS to produce dynamic and attractive three-dimensional (3-D) maps is one of
its most exciting benefits.

Map makers use a range of visual symbols to show height information and create the illusion
of an undulating surface:
• Contours
• Spot height symbols
• Hill shading
• Cliff and slope symbols
• Viewpoint symbols

2.7.2 3-D GIS data

Height information can be captured in a GIS in exactly the same way as the shape and
location of objects. The spectacular ability of today’s computers to perform calculations
means that 3-D models of the ground surface can be constructed from data recording the
height at different points across an area.

The typical way this information is stored is an extension of the conventional grid coordinate
system: as well as recording the easting (the x axis) and northing (the y axis) for a given
point, the elevation (the z axis – usually as height above sea level in metres) is also stored.
Thus the height information for an area is often referred to in terms of z values. The
fluctuations in ground height across an area are a continuous phenomenon: every point on
the ground has a z value irrespective of whatever physical features are present.

Point height information can be collected by surveyors out in the field, or more commonly by
using remote sensing, including photogrammetry (section 2.3). Points of the same height can
be joined to form a line or contour.

Once created, most 3-D GIS data is stored as a grid of points, with x, y and z values stored
as attributes, often referred to as a digital terrain model (DTM) or digital elevation model
(DEM). From this grid a computer can build a 3-D model.
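
A grid of z values is easy to work with in code. The sketch below stores a small DTM as rows
of heights and calculates the average gradient between two grid points. The heights, the 50 m
grid spacing and the function name are hypothetical values chosen for illustration.

```python
import math

# Hypothetical DTM: heights in metres on a regular grid of 50 m cells.
CELL_SIZE = 50.0
dtm = [
    [100.0, 105.0, 110.0],
    [100.0, 106.0, 112.0],
    [101.0, 108.0, 115.0],
]


def gradient_percent(grid, cell_size, start, end):
    """Average gradient, in per cent, between two (row, column) grid points."""
    (row1, col1), (row2, col2) = start, end
    # Horizontal distance across the ground between the two grid points.
    run = math.hypot((row2 - row1) * cell_size, (col2 - col1) * cell_size)
    if run == 0:
        raise ValueError("start and end are the same grid point")
    # Difference in z value between the two grid points.
    rise = grid[row2][col2] - grid[row1][col1]
    return 100.0 * rise / run


# Climb from 100 m to 110 m over a horizontal distance of 100 m.
slope = gradient_percent(dtm, CELL_SIZE, (0, 0), (0, 2))
```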

An alternative method of representing a surface is to create a triangulated irregular network
(TIN). A TIN model forms a continuous surface by connecting irregularly spaced spot heights
to form triangles, keeping a flat surface within each triangle.
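
Because each triangle in a TIN is flat, the height of any point inside it can be worked out
from the three corner heights. The sketch below does this with barycentric weights, a
standard technique that this document does not itself describe. The function name and the
sample spot heights are assumptions made for illustration.

```python
def height_in_triangle(point, a, b, c):
    """Height at point (x, y) inside the flat triangle with corners a, b and c.

    Each corner is an (x, y, z) spot height. Returns None if the point lies
    outside the triangle.
    """
    x, y = point
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = a, b, c
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        raise ValueError("the three corners lie on a straight line")
    # Barycentric weights: how much each corner contributes to the point.
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1 - w1 - w2
    if min(w1, w2, w3) < 0:
        return None
    return w1 * z1 + w2 * z2 + w3 * z3


# Hypothetical spot heights forming one triangle of a TIN.
corner_a = (0, 0, 10.0)
corner_b = (10, 0, 20.0)
corner_c = (0, 10, 30.0)
height = height_in_triangle((2, 3), corner_a, corner_b, corner_c)
```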

2.7.3 Realistic models

These 3-D models can be made to look very realistic by applying colour to the surfaces. It is
even possible to drape raster images of maps or aerial photos over the surface with quite
stunning effect. Furthermore, if the heights of physical objects like buildings, forests and
electricity pylons are known, these can also be built into the 3-D model. Hence it is possible
to create computer models of entire towns and villages which relate directly to the real world.

The ability of GIS to operate in three dimensions has many useful applications, for example:
• visualisation of the 3-D landscape;
• calculation of gradients for roads and railways;
• environmental impact analysis for engineering projects;
• screening of objects such as power stations and wind turbines through line of sight
analysis;
• radio wave propagation analysis – important to mobile communication networks;
• flood risk analysis;
• town planning; and
• leisure products – many computer games use realistic landscapes based on GIS height
data.

Below is a selection of thumbnail images showing different types of 3-D view captured from
GIS. Have a look at them to see just how effective these visualisations can be. Try
downloading the animations for a glimpse of how GIS can build entire virtual worlds.

Mobil 1 Rally Championship computer game using Ordnance Survey data

Ben Nevis fly-through Snowdon fly-through Scafell fly-through

And now for something completely different. Topology is one of the most revered examples
of jargon in the whole subject of GIS. The next section demystifies this term, which is actually
rather important: section 2.8 Topology. It’s all about relationships.

2.8 Topology. It is all about relationships

2.8.1 It is all about relationships

While the term topography describes the precise physical location and shape of geographical
objects, the term topology is more concerned with the logical relationships between the
position of those objects. For example, in a topographic map of Hyde Park, London, you
would show an accurate depiction of the shape of the park and a precise alignment of the
shape of the objects within it – Serpentine lake, for instance.

In a topological map the precise shape of the objects is not important – there will be a shape
called Hyde Park and a shape called Serpentine lake, but most importantly the Serpentine
lake object will be entirely contained inside the Hyde Park object.

It is the knowledge of this spatial relationship which is key. This may seem a dry and obscure
point, but topology is critical to understanding how the computer is able to analyse the
relationships between objects. If the topology of a set of data is wrong then the GIS cannot
analyse how objects relate to each other: are they next to each other? Do they overlap? Do
they form a connection? Does one lie completely within another?

Geospatial data will have topology inherited from the source material. Hence, when you
digitise a map, the topology, which is implicit in the visual interpretation of the map, is built
into the data. However, care is needed. Unless the data is topologically correct the computer
will not necessarily pick up the relationships.

One of the commonest errors when digitising data occurs when there is a slight inaccuracy in
the start or end point of a line. This can result in the linework not being correctly joined up.
The line can form an undershoot or an overshoot (see diagram below).

Although these errors can be difficult to detect by the human eye, they prevent the GIS from
understanding the fact that these two features are actually joined to each other.

The two most important aspects of topology in GIS are link-node topology and polygon
topology.

2.8.2 Link-node topology

A GIS does more than just show the positions of objects on a computer screen; it can also be
used to model real-world events. One of the most important examples of this is the ability to
model networks. There are many networks in geographical data, such as water courses and
street maps. A GIS can analyse the potential flow around these networks, a useful ability in
flood analysis or route finding. It can only do this if the data has correct network topology –
the joining of the lines at exactly the same point in the data. Lines in a GIS network are
usually called links, the points which define the shape of the link are called vertices, and the
points at which they join are called nodes.
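
The importance of exact joins can be shown with a small sketch. In the example below each
link is a list of vertices whose first and last points are its nodes, and two links are
connected only if they share a node with exactly the same coordinates. The route is found with
Dijkstra's shortest-path algorithm, which is an assumption of this sketch rather than
something this document specifies; the function names and coordinates are hypothetical.

```python
import heapq
import math
from collections import defaultdict


def build_network(links):
    """Map each node to a list of (neighbouring node, link length) pairs."""
    network = defaultdict(list)
    for vertices in links:
        # The length of a link is the sum of the straight pieces between vertices.
        length = sum(math.dist(p, q) for p, q in zip(vertices, vertices[1:]))
        start, end = vertices[0], vertices[-1]
        network[start].append((end, length))
        network[end].append((start, length))
    return network


def shortest_distance(network, origin, destination):
    """Length of the shortest route between two nodes, or None if unconnected."""
    queue = [(0.0, origin)]
    best = {origin: 0.0}
    while queue:
        dist, node = heapq.heappop(queue)
        if node == destination:
            return dist
        if dist > best.get(node, math.inf):
            continue  # a shorter way to this node has already been found
        for neighbour, length in network[node]:
            candidate = dist + length
            if candidate < best.get(neighbour, math.inf):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return None


links = [
    [(0, 0), (3, 4)],             # length 5
    [(3, 4), (3, 10)],            # length 6
    [(0, 0), (0, 10), (3, 10)],   # length 13, with one intermediate vertex
    [(3, 10), (9.9, 10)],         # undershoot: should have ended at (10, 10)
    [(10, 10), (10, 20)],
]
network = build_network(links)
route = shortest_distance(network, (0, 0), (3, 10))
broken = shortest_distance(network, (0, 0), (10, 20))
```

The small undershoot at (9.9, 10) is enough to cut the last link off from the rest of the
network, so no route to (10, 20) can be found.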

We will learn later in chapter 4 how link-node topology can be used in network analysis.

2.8.3 Polygon topology

Area features are defined in a GIS by the linear shape of the perimeter and some kind of
reference point indicating that the space enclosed by those lines relates to a geographical
feature. This reference point is referred to as a centroid, a seed or a label point.

Geospatial data is often captured in the form of linework showing the extent of physical
features on the ground, like fences, roads, and rivers. The area features then have to be
identified by assigning seed points to each bit of space in the resulting map data. For
example, a wall feature may, at one point of its length, define the perimeter of a school
playground, but further along form the edge of someone’s front garden. It is the seeds that
store the information about which links make up the edge of an area feature and what it is. It
is very important to avoid undershoots in the data, otherwise the system cannot tell whether
an area is closed at a particular point.
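
The link between seeds and areas can be sketched once the linework has been joined into
closed rings. The example below tests which ring each seed point falls inside, using the
ray-casting method, which is a common technique but an assumption of this sketch. It does not
show how a GIS builds the rings from individual links; the ring coordinates and seed labels
are hypothetical.

```python
def contains(ring, point):
    """True if point (x, y) lies inside the closed ring of (x, y) vertices."""
    x, y = point
    inside = False
    # Look at each edge of the ring in turn, including the closing edge.
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        if (y1 > y) != (y2 > y):
            # The edge crosses the horizontal line through the point.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def label_areas(rings, seeds):
    """Return {ring index: label} for every seed that falls inside a ring."""
    labels = {}
    for label, seed in seeds.items():
        for index, ring in enumerate(rings):
            if contains(ring, seed):
                labels[index] = label
    return labels


# Two hypothetical areas that share the wall running from (10, 0) to (10, 10).
rings = [
    [(0, 0), (10, 0), (10, 10), (0, 10)],
    [(10, 0), (20, 0), (20, 10), (10, 10)],
]
seeds = {"School playground": (5, 5), "Front garden": (15, 5)}
labels = label_areas(rings, seeds)
```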

There will be much more about the structuring of data in Chapters 3 and 4.

3 Adding real-world information
3.1 Introduction

The previous section talked about how the geospatial element of map data can be
incorporated and viewed in GIS. However, to really uncover the power of GIS, we need to
look at what happens to the attribute data.

This information tells you not just the shape of a feature but what it is, and any other possible
piece of information that may exist about it. A map may show that a feature is a river by
depicting it as a blue line; it may also record its name using a blue text label. In a GIS this
information may be stored within the data itself.

There are numerous possibilities for exploiting this attribute information. Any information that
relates to a place on the ground can be loaded into a GIS and analysed. When you consider
the amount of detail contained in GIS data, this can often raise the question – Is Big Brother
watching you? Well, yes actually, he probably is. We are now well and truly living in an
information age and there is no escaping the fact that some of this information is about
ourselves. The difference with GIS is that if Big Brother is watching us, he is less likely to be
looking at us in the wrong place.

This section looks at the different ways in which the information attached to GIS objects can
be interrogated and exploited.

3.2 The attributes of map features

3.2.1 What are attributes?

The maps shown in GIS are intelligent – the features know their own identity.

How?

In chapters 1 and 2 the ways in which geographical information can be loaded into a
computer and displayed in a GIS have been described. We now move on to show how this
information can be used. The term attribute describes any piece of information about an
object that can be stored in addition to its geographic properties. For instance, a road may
have a number, a name, a maximum width, a speed limit and so on. GIS can work with this
descriptive attribute information to create intelligence far in advance of what can be
achieved by placing text on a paper map.

With GIS you are no longer restricted by how many text descriptions can be fitted into the
available space to convey information about the objects in an area. Tabular information can
be stored about each of the objects just as in a database, so allowing an almost infinite array
of attributes to be recorded. All GIS have simple tools that allow the interrogation of the
features. Hence, by using your mouse to click on an object, a full set of attributes can be
displayed without that information having to be on screen all the time. The object of interest
can be identified by the visual map graphic and then that object can tell you its own
attributes. Which is very clever stuff.
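
To make the idea concrete, here is a minimal sketch in Python of a feature that carries its own attributes. The class name Feature, the function identify and every value shown are invented for illustration; real GIS software stores features in its own formats.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    """A map object: its geometry plus a free-form table of attributes."""
    geometry: list                      # list of (easting, northing) pairs
    attributes: dict = field(default_factory=dict)


# Hypothetical road feature; the values are made up for illustration.
road = Feature(
    geometry=[(437000.0, 115000.0), (437250.0, 115120.0)],
    attributes={"number": "A11", "name": "High Street", "speed_limit_mph": 30},
)


def identify(feature):
    """Return a copy of the feature's attributes, as a mouse click might."""
    return dict(feature.attributes)


print(identify(road))
```

Because the attributes live in a table rather than on the map face, adding another one (a maximum width, say) costs nothing in screen space.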

Move your mouse over this graphic to see the attributes that are associated with the house
and the road.

3.3 The attributes of map features

3.3.1 Attribute tables

Most GIS enable the user to view the data in tabular form without necessarily using map
graphics at all. This is equivalent to using typical office spreadsheet software. Often you may
know the name of an object but not necessarily where it is, hence, you can use the table to
find the object and then switch to the map to see where it is.
• Move your mouse over the table below; notice how it also locates the object on the map.
• Move your mouse over the map below; notice how it finds the object’s attributes in the
table.

The GIS forms a constant link between the attributes and the geographical properties of each
of the features: you can get either one of these if you know something about the other. This
is the basis of location-finding mapping services on the Internet: you can generate a map for
any location because there is a data layer with a link between the postcode attribute and the
geographical coordinates. You can see how this works on the Ordnance Survey Get-a-map™
pages, but please press the back button to continue with The GIS files.
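
The two-way link can be sketched in a few lines of Python. This is only an illustration: the layer, the postcodes and the function names locate and identify_at are assumptions, not part of any Ordnance Survey product.

```python
import math

# Hypothetical layer: each record holds attributes and a coordinate pair.
layer = [
    {"name": "Town Hall", "postcode": "AB1 2CD", "easting": 1000.0, "northing": 2000.0},
    {"name": "Library", "postcode": "AB1 3EF", "easting": 1400.0, "northing": 2300.0},
]


def locate(records, postcode):
    """Attribute to location: coordinates of the record with this postcode."""
    for rec in records:
        if rec["postcode"] == postcode:
            return (rec["easting"], rec["northing"])
    return None  # no record carries that postcode


def identify_at(records, easting, northing):
    """Location to attributes: the record nearest to the given position."""
    return min(
        records,
        key=lambda r: math.hypot(r["easting"] - easting, r["northing"] - northing),
    )


print(locate(layer, "AB1 3EF"))
print(identify_at(layer, 1010.0, 1990.0)["name"])
```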

GIS can be used to link to any piece of information that may exist about an object from other
systems, as we will see in section 3.4 GIS can tell you everything worth knowing about
anything.

3.4 GIS can tell you everything worth knowing about anything
Once a feature is loaded into a GIS, any piece of information about that object can be linked
to it. How does this work?

When you start with geospatial data you often only have attributes that could be determined
from the original source material. Information gleaned from the original map might, for
example, show a line feature as an A road, numbered A11. However, the GIS can be used to
link to any piece of information that may exist about the object from other systems. This can
often lead to very powerful applications of GIS.

Any organisation which holds information about geographical objects can load that
information into a GIS as long as they have some map data containing the relevant objects.
Therefore it is not just the attributes that come with the geographical data that can be
interrogated but any other item of information known about the object.

For this to work it is necessary to have some kind of common referencing system so that the
correct record in the geospatial data can be matched with the corresponding record in the
non-geospatial data.

Use the graphic below to run through this example.

1 This shows you the Ordnance Survey
data about this river and a map
showing its location.

2 This shows the Ordnance Survey
data and the environmental data –
notice the different information stored
in each table and the common
reference.

3 This shows all the information joined
together and its location on the map.

This process has allowed us to link other attributes (environmental information) to the map.

This kind of application is very much dependent on the ability to establish links between the
entities in the two sets of information. Often it is better to use a numerical referencing system
understood by all users of a particular type of information, so that the specific features can be
identified unambiguously. If you just use text names this can fall down if one set of
information has a misspelt name or if there are duplicate entries. There are, for example,
many stretches of river in Britain with the name attribute River Avon.
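
The difference between linking on a name and linking on a common reference can be shown with a small Python sketch. The identifiers, field names and values below are invented for illustration and do not come from real data.

```python
# Hypothetical map-layer records for two different stretches of river.
map_layer = [
    {"feature_id": 1001, "name": "River Avon", "length_km": 12.5},
    {"feature_id": 1002, "name": "River Avon", "length_km": 8.0},
]

# Hypothetical environmental records held in a separate system,
# keyed on the same identifier.
environment = {
    1001: {"water_quality": "good"},
    1002: {"water_quality": "poor"},
}


def join_by_id(layer, external):
    """Attach external attributes to each feature sharing an identifier."""
    joined = []
    for rec in layer:
        extra = external.get(rec["feature_id"], {})
        joined.append({**rec, **extra})
    return joined


joined = join_by_id(map_layer, environment)

# A join on the name alone would be ambiguous: both features share one name.
names = [rec["name"] for rec in map_layer]
name_is_ambiguous = len(set(names)) < len(names)

print(joined)
print(name_is_ambiguous)
```

With the numeric reference each stretch of river picks up its own environmental record, even though the two names are identical.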

Ordnance Survey has developed its own common reference system using millions of
Topographic Identifiers (TOIDs). These are unique 16-digit numbers applying to every
feature in its large-scale database. TOIDs will make it a lot easier for users to link, combine
or transfer information quickly and efficiently. This system is part of a massive project known
as the Digital National Framework™ (DNF™). We will cover the DNF in later chapters.

Once the GIS is populated with feature attributes, the layers can be analysed in many
different ways using queries and selections. We will now look at that in section 3.5 Using
GIS? Be selective.

3.5 Using GIS? Be selective


You can query the features in GIS map layers by selecting and viewing just those that satisfy
particular criteria. How useful is that?

Performing selections on information held in spreadsheets and databases is the classical
way in which computer users make sense of large volumes of data and provide answers to
specific problems. Within GIS it is possible to do just the same, only with the added
advantage that the results of those queries are displayed in a geographical context. So, not
only can you identify which records satisfy a particular set of criteria, you can also see where
they are in relation to each other.

Data stored in tables is usually very difficult to digest all at once. It is necessary to filter out
relevant sets of information corresponding to a particular group of conditions. For example,
imagine you are visiting a city with which you are unfamiliar. You might have a map showing
the city centre and the location of all the restaurants. With GIS you can select just the
range of restaurants that fit a particular set of criteria.

You can make this type of query as simple or as complicated as you like, as long as the data
fields are there to interrogate. This ability is not unique to GIS software; many different types
of information system will allow you to perform selections. However, only GIS can provide a
visual representation of the location of the query results. Furthermore, GIS can apply
geographical criteria to the selection filter such that objects are selected based on where
they are. We will cover this in more detail in Chapter 4.
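
A selection on attributes is easy to sketch in Python. The restaurant records, field names and the helper select below are assumptions made up for this example; a real GIS would offer its own query tools.

```python
# Hypothetical restaurant layer: attributes plus a map position.
restaurants = [
    {"name": "Luigi's", "cuisine": "italian", "avg_price": 18, "easting": 500, "northing": 700},
    {"name": "Casa Roma", "cuisine": "italian", "avg_price": 35, "easting": 620, "northing": 710},
    {"name": "Spice Route", "cuisine": "indian", "avg_price": 15, "easting": 480, "northing": 905},
]


def select(records, predicate):
    """Return only the records for which the predicate is true."""
    return [rec for rec in records if predicate(rec)]


# Query: Italian restaurants costing no more than 20 per head.
cheap_italian = select(
    restaurants,
    lambda r: r["cuisine"] == "italian" and r["avg_price"] <= 20,
)

# Each selected record still carries its coordinates, so it can be drawn.
for rec in cheap_italian:
    print(rec["name"], rec["easting"], rec["northing"])
```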

The next section, 3.6 Geocoding, builds on the idea of integrating tabular data by explaining
the term geocoding, another favourite example of GIS jargon.

3.6 Geocoding
This is one of the key functions of GIS, but what does the term geocoding actually mean?

In a way, the concept of geocoding is very similar to the idea of linking to external datasets,
for which the river data example was used in section 3.4. Geocoding describes another way
of importing non-map data into the GIS such that its geographic properties can be identified
and the records positioned in space. However, unlike the linking method in which the
additional attribute table remains external to the mapped layer, when a table is geocoded it
becomes a new map layer in its own right. Coordinate points are assigned to the geocoded
table so that it can be used on its own to display the locations of the objects concerned.

OK, we may have lost you, so… Geocoding is easier to explain with a worked example:

It usually takes place with a list of locations with known addresses. Imagine you have a
simple table of British football clubs containing the name of the club and the postcode. To
geocode this list you have to process each record against the postcode data in the GIS.
There are already several GIS data products that store the National Grid coordinates for
every postcode in the country. The National Grid coordinates from the postcode product get
copied across to join the football club list. This creates a new football club layer that can be
added to the GIS.
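
The football club example can be sketched in Python as follows. The club names, postcodes and coordinates are invented, and the function geocode is a hypothetical helper rather than a feature of any particular product.

```python
# Hypothetical postcode product: postcode -> (easting, northing).
postcode_coords = {
    "AB1 2CD": (430000, 110000),
    "XY9 8ZW": (350000, 400000),
}

# Hypothetical table with no coordinates at all.
clubs = [
    {"club": "Example United", "postcode": "AB1 2CD"},
    {"club": "Sample Rovers", "postcode": "XY9 8ZW"},
    {"club": "Nowhere Town", "postcode": "ZZ0 0ZZ"},
]


def geocode(records, lookup):
    """Copy coordinates across to every record whose postcode is known.

    Returns the new point layer and the records that could not be matched.
    """
    located, unmatched = [], []
    for rec in records:
        coords = lookup.get(rec["postcode"])
        if coords is None:
            unmatched.append(rec)
        else:
            located.append({**rec, "easting": coords[0], "northing": coords[1]})
    return located, unmatched


club_layer, not_found = geocode(clubs, postcode_coords)
print(club_layer)
print(not_found)
```

Keeping the unmatched records separate is a design choice worth copying: a mistyped postcode should be reported, not silently dropped.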

Geocoding is often applied to address lists. Not many people know the National Grid
coordinates of where they live so any list of people’s addresses needs to be geocoded to
load it into a GIS. Any company, a bank for example, which holds an address list of
customers can geocode this information and instantly analyse their geographic distribution.
This may reveal trends that the bank would otherwise be unaware of, such as areas of
particularly high or low customer density, which may suggest reasons for the successful
recruitment of customers. We will look at how organisations are using GIS to improve their
decision making in greater detail in the next chapter.

Finally, in the last section of this chapter, we look at the way that geospatial data is
structured and how this can affect the amount of information that can be analysed in a GIS:
section 3.7 Structured GIS data is the key.

3.7 Structured GIS data is the key

3.7.1 The significance of structure

For geospatial data to be really useful to GIS applications it must be structured to model the
real world. What is the significance of structured data?

In the last few pages we have seen that there are a number of ways of getting general types
of information about objects into a GIS. This always relies on the presence of a layer storing
the location of a set of objects along with some attribute like a name or an ID number.

Our examples so far show that:


• to link environmental information about rivers we needed a river map layer possessing
names as attributes; and
• to geocode a set of addresses we needed a geographic layer with the coordinates of
postcode locations together with the postcode text itself.

In these examples the GIS must hold location and attribute data that corresponds to the
physical objects which people want to analyse. This means that the structure of the GIS
data must be attuned to the types of application that the system is meant for.

Do not worry, this will all become clearer as we move through this chapter and the next. The
idea is that data has to be structured in a certain way to carry out certain types of spatial
analysis.

We will now look at Ordnance Survey point data that is often used for geocoding and
Ordnance Survey road network data that is often used for analysis of networks.

3.7.2 Address data

Ordnance Survey produces data products that have their origins in the accurate large-scale
archive but which are tailored for use in GIS. The simplest is ADDRESS-POINT®, whereby
the building seeds from Land-Line® have been matched with the Royal Mail Postal Address
File (the UK authority on addresses). This has created a basic point dataset where the
location is stored as a pair of National Grid coordinates along with the address attributes, like
house number, street name and postcode. This data is as far removed from a conventional
paper map as can be imagined.

On its own, ADDRESS-POINT has no capacity for visual interpretation as a map. It is purely
a resource to enable address-based information sources to be loaded into a GIS for analysis.

3.7.3 Road network data

One of the major uses of GIS is in the modelling of transport networks. To get a really
comprehensive model of a road network we have to look to large-scale sources for the most
accurate possible information. However, the features that are captured in the source
mapping are physical objects such as the roadside kerbs and the fences bordering the
pavement. You can see that it is a road but there is nothing necessarily in the data to define
discrete chunks of the road network. This is why many mapping agencies have captured
road centrelines in addition to the physical features shown on paper plans, as shown in the
graphic below.

Ordnance Survey has generated a set of road centrelines showing the complete road
network from the accurate information in the large-scale data; the resulting family of products
is referred to as OSCAR® (Ordnance Survey Centre Alignment of Roads) data. Again, in
isolation this data does not appear helpful in a visual sense. However, with its link and node
topology (see Chapter 2) and attributes about road names and numbers, OSCAR is the basis
of many powerful GIS applications.

1 OSCAR data

2 OSCAR displayed over 1:10 000 data backdrop

3 OSCAR data using colour to show traffic volumes

The third graphic shows how traffic volumes may vary on a road network: the red roads carry
the highest volume and the green the lowest. The ability to analyse a network like this is
useful for finding problem areas and suggesting alternative routes.

3.7.4 Spaghetti data

Large-scale topographic data is normally captured by the survey of physical objects on the
ground. Before the advent of GIS the need for structured data was not understood, so data
was captured quite randomly. Often a line along a connected series of objects was captured
as a single feature, instead of a separate line for each object or house. This is especially so
in housing estates, where long fences or walls surrounding a series of properties were
digitised in one go. When the data is drawn on screen it looks fine; you can see a row of
houses, each with its own front and back garden. However, a look at the attribute table
shows that the data does not consistently correspond to discrete real-world objects like the
wall of a single house or the house itself.

This kind of data is often referred to as spaghetti data (as unstructured and random as
spaghetti thrown onto a plate). The lines are a tangled mixture that can be interpreted
visually as a large-scale map, but they do not explicitly store each separate datum that the
GIS could potentially analyse. Put simply, even though it looks good, it does not make much
sense and it is not very useful for analysis. There is much more need to analyse information
about houses, properties and land parcels than there is about the linear features which
enclose them; only a creosote manufacturer is likely to be interested in analysing the
attributes of fences.

Now let's look at how improving this data structure provides significant benefits to GIS users.

3.7.5 Polygon structured data

It is possible to create large-scale data that enables the identification of every single discrete
parcel of land from spaghetti data. Many different GIS programs are able to automatically
convert spaghetti data to polygon structured data by identifying every bit of space between
linework. This means that the data is much more useful as it represents the real world in a
much more realistic way. Again the data looks good, but now it makes much more sense and
can be used easily straight off the peg for complex analysis.

Move the cursor over the map to see how you can pull up information about defined areas.
You cannot do this with spaghetti data!

Having converted all its paper maps to digital form, an almost equally daunting programme of
work is now under way at Ordnance Survey to convert the point and line digital data into
polygon structured data. The result of this re-engineering process will be known as
OS MasterMap™ developed from the Digital National Framework™ (DNF™).

The decision to create OS MasterMap is the true measure of the impact of GIS on society.
The needs of the country for map information are much more oriented to use in GIS (needs
to be structured) than towards traditional visual renderings (does not need to be structured).
OS MasterMap will encourage many more potential GIS applications to become possible, as
users will be able to load information stored about actual objects or parcels of land into GIS
in a consistent and comprehensive way.

We will be covering OS MasterMap in later chapters. In the meantime, we have a useful
demo on our OS MasterMap page that will help explain the benefits of re-engineering our
data.

Because so many companies and individuals use different types of geographical data, there
is a need for consistency in storing attributes. This consistency means that a full range of
information about objects can be brought together in a GIS and it is easier to share
information. This is important, so a lot of effort goes towards defining standards for data
structures. There are British, European and global standards bodies working very hard to
make sure that all data providers and software vendors are working with data structured in a
common way, so that the different datasets can be linked together. There is much more
about data standards in Chapter 6, section 6.3 Standards.

4 Putting it all together as a system
4.1 Introduction

With the different features described in the previous chapters, it is clear that GIS are very
flexible and sophisticated tools. Not only can they understand the location element of map
data and manipulate information about shapes and structures, they can also work with
attribute information to store intelligence about objects. It is the fusion of these two functions
that makes this tool so powerful.

This chapter looks at the different ways in which GIS can provide practical solutions.

4.2 Unlocking the information


All organisations have information locked away in various databases – GIS can help uncover
the full value of this information.

Approximately 80% of all information held in databases anywhere in the world contains some
kind of geographic element. For example, records in a database can be tied to a particular
location on the ground, such as an address, building, property or road junction. There are
many trends and relationships hidden in this geographic data, but it is only by using a GIS
that these are revealed.

Many different organisations use GIS as a central part of their activities, and the range of
applications in use is extraordinary.

For example:
• utilities – leak management, service planning, network planning;
• central government – census, environmental planning, health service catchment areas;
• local government – refuse collection, street lighting, council tax collection;
• emergency services – crime locations, route finding;
• military – battlefield simulations;
• retail – travel time catchment areas, store site location;
• financial – insurance flood risk, property values; and
• target marketing – demographic profiles.

One of the best ways to analyse data is to produce colour-coded maps that reveal patterns in
data which may otherwise be missed: this is explained in section 4.3 GIS reveals all.

4.3 GIS reveals all

4.3.1 The thematic map

By bringing together data from a wide range of different sources, you can visualise trends in
the data by creating thematic maps. How does this work?

We discussed in previous chapters how a GIS display of map information is very flexible.
Unlike a paper map it does not require every piece of information to be visible at the same
time. It can also change the depiction of a particular object depending on the value of one of
its attributes. This function is known as thematic mapping. You will already be familiar with
thematic maps from atlases and geography textbooks. For example, a map of parliamentary
constituencies shaded in different colours can show the number of seats held by different
political parties. GIS can build this kind of map automatically from the data values (number of
seats), and in many different ways.
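
Building a thematic map from data values comes down to sorting each value into a class and giving each class a colour. The Python sketch below shows one simple way to do it; the class boundaries, colours and the function name colour_for are all assumptions chosen for illustration.

```python
import bisect

# Hypothetical class boundaries and one colour per class.
BREAKS = [10, 50, 100]
COLOURS = ["green", "yellow", "orange", "red"]


def colour_for(value):
    """Pick the colour of the class that the value falls into.

    Values below 10 are green, 10 up to 50 yellow, 50 up to 100 orange,
    and 100 or more red.
    """
    return COLOURS[bisect.bisect_right(BREAKS, value)]


# Hypothetical attribute values for four map objects.
values = {"area 1": 5, "area 2": 10, "area 3": 75, "area 4": 250}
theme = {name: colour_for(v) for name, v in values.items()}
print(theme)
```

If the values in the table change, running the same function again recolours the map with no redrawing by hand.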

Thematic maps come in all shapes and sizes, for example:


• a map of farmers' fields showing a different colour for each type of crop grown;

• different sized point symbols to show the relative population of towns;

• a display of a road network with different colours to show average traffic speeds;
and

• a density map to show average numbers
of badgers across different counties.

These examples are typical of the types of thematic mapping for which GIS is used. The
really dynamic thing about these maps is that they can automatically change their
appearance as the values in the data tables change with time. Hence, you can use this kind
of mapping to constantly monitor traffic flow.

4.3.2 Visual analysis

The graphic below shows a range of different thematic map layers. There are basically two
different layers here. The first layer displays the relative turnover of a set of burger bars
across a city (the blue dots) and the second layer is background mapping. This second layer
can be switched between a series of data layers showing a different set of attribute
information for the same area. Try swapping the layers around and see if you can spot a
correlation between the background layer and the burger bar layer. Which layer shows a
pattern that fits with the more successful burger bars? Why might these layers be related?

Have a look…

This example shows how GIS can be used to visually analyse information and help to explain
spatial patterns.

However, you do not have to rely on a visual interpretation of trends in data – the GIS can do
this for you. This is explained in section 4.4 What happens where? – The power of the spatial
query.

4.4 What happens where? – The power of the spatial query

4.4.1 Basic spatial querying

GIS can not only tell you what information exists about particular features in the map data but
it can also analyse where things are in relation to each other. What can this tell us?

GIS can go beyond visual analysis of thematic mapping as described in the previous section.
The software can identify trends across a given area as well as performing specific queries.
Such queries can select attribute data depending on its geographical location and then
interrogate the attributes by performing calculations and statistical analysis. Selecting data
based on the geometry of objects is known as performing a spatial query.

The simplest spatial query can be performed on screen using the selection tools that are
provided with the GIS software. For instance, you can draw a circle on screen and select all
objects falling inside it. This example shows addresses that have been selected because
they fall within the circle.

This technique could be used by the emergency services to quickly identify all houses within
500 m of a spillage of a dangerous chemical.
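
A circle selection of this kind is just a distance test against every point. Here is a minimal Python sketch, assuming coordinates in metres; the addresses, the spill location and the function name within_radius are invented for the example.

```python
import math

# Hypothetical address points with coordinates in metres.
addresses = [
    {"address": "1 Mill Lane", "easting": 1200, "northing": 1100},
    {"address": "7 Quay Road", "easting": 1300, "northing": 1400},
    {"address": "22 Hill View", "easting": 1900, "northing": 1000},
]


def within_radius(points, centre, radius_m):
    """Select the points lying inside or on a circle around the centre."""
    cx, cy = centre
    return [
        p for p in points
        if math.hypot(p["easting"] - cx, p["northing"] - cy) <= radius_m
    ]


spill = (1000, 1000)  # hypothetical location of the chemical spillage
at_risk = within_radius(addresses, spill, 500)
print([p["address"] for p in at_risk])
```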

4.4.2 Buffers

It is possible to analyse how close objects are to one another using a buffer. A buffer is a
shape based on any other existing object (point, line or area) that can be generated by the
GIS. The buffer object represents the total area within a certain distance of a given feature.

Example of a point buffer

Example of a line buffer

Example of an area buffer

You can use the GIS to generate buffer zones and then identify all features that lie within a
particular distance. For instance, you can select all addresses within a 500 m buffer of a busy
road and compare these with data about the incidence of asthma. By comparing both sets of
data you can work out if there are statistically more asthma sufferers living in the buffer
region than in the general population. This allows you to analyse whether proximity to a busy
road is likely to be a factor in the cause of asthma.
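
A GIS would normally build the buffer shape for you, but the test behind a line buffer can be sketched directly: a point is in the buffer if it lies within the given distance of any segment of the line. The Python below is an illustration only; the road coordinates and function names are assumptions.

```python
import math


def distance_to_segment(p, a, b):
    """Shortest distance from point p to the straight segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)  # segment is a single point
    # Position of the nearest point along the segment, clamped to its ends.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def in_line_buffer(p, line, distance):
    """True if p lies within the given distance of any part of the line."""
    return any(
        distance_to_segment(p, a, b) <= distance
        for a, b in zip(line, line[1:])
    )


# Hypothetical busy road (metres) with one bend in it.
road = [(0, 0), (1000, 0), (1000, 1000)]
print(in_line_buffer((500, 400), road, 500))
print(in_line_buffer((400, 600), road, 500))
```

Counting asthma cases among the addresses that pass this test, and comparing the rate with the rest of the population, is the statistical step described above.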

4.4.3 Overlay operations

As well as drawing simple shapes and calculating buffers to select objects, it is possible to
place layers on top of each other (remember we looked at combining layers in Chapter 2,
section 2.6 Looking at multiple layers) and select all objects from one layer that lie within an
object from another layer.
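
Selecting the objects of one layer that lie within an object of another usually rests on a point-in-polygon test. The Python sketch below uses the well-known ray-casting method; the layers and the helper select_within are hypothetical, and points lying exactly on a boundary are not treated specially.

```python
def point_in_polygon(point, polygon):
    """Ray casting: count how many polygon edges a ray to the right crosses."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def select_within(points, polygon):
    """Objects from the point layer that fall inside the polygon."""
    return [name for name, xy in points.items() if point_in_polygon(xy, polygon)]


# Hypothetical area layer (one square) and point layer.
ward = [(0, 0), (10, 0), (10, 10), (0, 10)]
schools = {"North School": (5, 5), "East School": (15, 5)}
print(select_within(schools, ward))
```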

4.4.4 The tricky but important bit...

A key advantage of being able to layer data in a GIS is to carry out overlay operations. These
can be quite complex, but simply mean combining layers of data to create one new layer
(similar to the way a mathematical sum or calculation creates a new value or answer).

In the example below a farmer needs a certain level of rainfall and a type of soil to
successfully grow a crop. By combining the rainfall map and the soil type map it is much
easier to find the best location. In this example the GIS assigns a numeric value to each soil
type and to the amount of rainfall. This makes it easier, on the resulting map, to see where
the optimum growing area is located.

The point is that by combining layers of information, the farmer creates a new map that is
much more useful.
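
The farmer's overlay can be sketched as a cell-by-cell sum of two small grids. The scores below are invented (higher means more suitable), and adding the two layers with equal weight is an assumption made for simplicity.

```python
# Hypothetical scores on a 3 x 3 grid; higher means more suitable.
rainfall_score = [
    [1, 2, 3],
    [1, 2, 3],
    [1, 2, 3],
]
soil_score = [
    [3, 2, 1],
    [2, 3, 1],
    [1, 1, 1],
]


def overlay_sum(a, b):
    """Combine two grids of equal size into a new grid, cell by cell."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


suitability = overlay_sum(rainfall_score, soil_score)

# Find the cell or cells with the highest combined score.
best_value = max(max(row) for row in suitability)
best_cells = [
    (r, c)
    for r, row in enumerate(suitability)
    for c, value in enumerate(row)
    if value == best_value
]
print(suitability)
print(best_cells)
```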

Overlay operations are particularly effective when using raster data. As discussed in
Chapter 1, raster data is good for representing the continuous varied surface of the earth,
whereas vector data makes assumptions that the edge of an area has a defined boundary.
An example of a vector overlay can be seen in the graphic above. The soil type areas are
clearly defined, whereas in reality we know where two soil types meet they often gradually
blend into each other. Another good example is using aerial photography to analyse
vegetation. It would be very difficult to draw onto a photograph where one area of vegetation
stopped and another began: the vegetated areas would be blurred into each other and
probably produce a speckled effect in the photograph.

GIS users must never forget that the result that a GIS provides is only as accurate as the
data that was used for the query. Do not forget that rubbish in equals rubbish out.

Spatial querying is used in many different ways to help understand the world around us.
Often the answer to a particular problem can only be unravelled by comparing two layers of
information in a way that would be almost impossible to achieve without the GIS software.
The next page looks at a much more specific GIS application that is increasingly becoming a
part of everyday life.

One of the most dramatic successes of the GIS revolution has been the development of
intelligent navigation systems using digital models of road networks: section 4.5 Show me
the way to go.

4.5 Show me the way to go

4.5.1 Network analysis

One of the most far-reaching applications of GIS is in network analysis. Network analysis is
the mathematical processing of the geometry of a link/node layer, enabling the identification
of all possible routes around that network, along with the distances and times involved. Put
simply, this means that, using an accurate road data layer, the computer can identify
possible routes between two locations and calculate the shortest.

Okay, now we have definitions out of the way, let us remember what we covered in
Chapter 2, section 2.8. Way back then, we discussed the concept of link-node topology and
how important it is to have the data structured correctly (that is, links joining at nodes with no
gaps). In order to carry out network analysis you need a link-node data layer of line features
representing a real-world network (for example, a road network); only then is it possible to
model movement around that network.

The simplest example of network analysis is to choose two points on the network and ask the
GIS to calculate the shortest path between them.

This basic concept can be used to help build navigation systems and to plan distribution
services.
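
A shortest-path calculation over a link/node layer can be sketched with Dijkstra's algorithm, one common technique for this job (the text does not say which method any given GIS uses). The network and its link lengths are invented for the example.

```python
import heapq

# Hypothetical link/node network: node -> list of (neighbour, length in metres).
network = {
    "A": [("B", 400), ("C", 900)],
    "B": [("A", 400), ("C", 300), ("D", 1000)],
    "C": [("A", 900), ("B", 300), ("D", 500)],
    "D": [("B", 1000), ("C", 500)],
}


def shortest_path(graph, start, goal):
    """Dijkstra's algorithm: return (total length, list of nodes) or None."""
    queue = [(0, start, [start])]
    visited = set()
    while queue:
        dist, node, path = heapq.heappop(queue)
        if node == goal:
            return dist, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, length in graph[node]:
            if neighbour not in visited:
                heapq.heappush(queue, (dist + length, neighbour, path + [neighbour]))
    return None  # the goal cannot be reached from the start


print(shortest_path(network, "A", "D"))
```

Note that the calculation only works because every link joins the network at a node; a gap in the data would look like a road that goes nowhere.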

4.5.2 In-car navigation

By applying the principles of network analysis to accurate road data, you can build systems
for motorists to navigate. Many cars now have gadgets that provide driving instructions,
either as a simple map display or by audio messages. These in-car-navigation tools are
actually specialised, miniature GIS. Because the data has attributes for road names and
numbers, intelligent instructions can be provided (for example, ‘take left turn A34 at next
junction’).

In-car navigation requires up-to-date map data and extra information to make the data
model behave like the real world:
you need to know which roads are one-way streets or where
there are no-right-turn signs. By using unique identifiers in the
road data so that each link in the network can be pinpointed,
additional information can be built into the system. Furthermore, it
is possible to receive real-time information about traffic conditions
as you drive, so that you get advance warning if there are
hold-ups due to road works or an accident.

4.5.3 Drive-time analysis

Another benefit of network analysis is the ability to calculate drive times, which identify how
far you can travel in a certain amount of time. The typical drive-time map, for example, for
pizza delivery, would show a central point surrounded by a series of circles estimating how
long it takes to get to places within that radius. This method assumes an as-the-crow-flies
route to each location.

A GIS can be much more accurate – it can use network analysis to generate isochrones
(lines that join up points of equal travel time) that take into account the true road network and
give a proper measure of how far you can get over a set time. This can even take into
account the average speed on each road, so that the area appears stretched along faster
roads.
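
Here is a minimal Python sketch of the idea, assuming each road link has a length and an average speed. The links, place names and function names are invented; a real system would go on to draw the isochrone around the reachable part of the network.

```python
import heapq

# Hypothetical links: (from node, to node, length in km, average speed in km/h).
links = [
    ("Depot", "A", 2.0, 30),
    ("Depot", "B", 5.0, 60),
    ("A", "C", 3.0, 30),
    ("B", "D", 10.0, 60),
]


def drive_times(road_links, origin):
    """Fastest travel time in minutes from the origin to every node."""
    graph = {}
    for a, b, km, kmh in road_links:
        minutes = 60.0 * km / kmh
        graph.setdefault(a, []).append((b, minutes))
        graph.setdefault(b, []).append((a, minutes))
    best = {origin: 0.0}
    queue = [(0.0, origin)]
    while queue:
        t, node = heapq.heappop(queue)
        if t > best.get(node, float("inf")):
            continue  # a quicker way to this node was already found
        for nxt, minutes in graph.get(node, []):
            new_t = t + minutes
            if new_t < best.get(nxt, float("inf")):
                best[nxt] = new_t
                heapq.heappush(queue, (new_t, nxt))
    return best


def reachable_within(road_links, origin, limit_minutes):
    """Nodes that can be reached within the time limit, in sorted order."""
    times = drive_times(road_links, origin)
    return sorted(n for n, t in times.items() if t <= limit_minutes)


print(reachable_within(links, "Depot", 10))
```

In this made-up network the 60 km/h road covers 5 km in about the time the 30 km/h road covers 2 km, which is why the reachable area stretches along faster roads.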

Many different organisations use this kind of drive-time analysis to plan their operations, from
the siting of new stores to the planning of distribution networks.

4.5.4 Optimum-path analysis

Network analysis does not have to be carried out on vector link and node data. As discussed
in the previous section, Overlay operations, raster data can be effective in describing
continuous varied surfaces. This quality of raster data is useful for identifying the optimum
path (path of least resistance or shortest path) through a continuous surface. For example, a
company needs to erect electricity pylons from A to B.

They need to make sure that they disrupt the forest areas as little as possible. The GIS
calculates the path of least resistance by finding the path whose cell values add up to the
lowest total. A vector line can then be added to show the location of the proposed route.
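
The pylon example can be sketched on a tiny raster in Python. The cost values (9 for forest, 1 for open ground), the four-direction movement rule and the function name least_cost_path are assumptions made for illustration only.

```python
import heapq

# Hypothetical cost raster: 9 = forest (avoid), 1 = open ground.
cost = [
    [1, 9, 1],
    [1, 9, 1],
    [1, 1, 1],
]


def least_cost_path(grid, start, goal):
    """Return (total cost, list of cells) for the cheapest route, or None.

    Movement is up, down, left or right; the total includes the start cell.
    """
    rows, cols = len(grid), len(grid[0])
    queue = [(grid[start[0]][start[1]], start, [start])]
    done = set()
    while queue:
        total, cell, path = heapq.heappop(queue)
        if cell == goal:
            return total, path
        if cell in done:
            continue
        done.add(cell)
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in done:
                heapq.heappush(
                    queue, (total + grid[nr][nc], (nr, nc), path + [(nr, nc)])
                )
    return None


# Route from A (top left) to B (top right).
print(least_cost_path(cost, (0, 0), (0, 2)))
```

The direct route through the forest is shorter in cells but costs more, so the cheapest route skirts round it.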

Finally in this chapter we will look at some examples of how different types of organisation
benefit from the range of functions described in the previous pages: section 4.6 Some simple
GIS examples.

4.6 Some simple GIS examples

4.6.1 How many useful applications can GIS provide?

We have now looked at a wide range of GIS concepts. GIS can be used to speed up any
process that formerly relied upon using paper maps. The analytical functions of GIS mean
that geographical information can be used in unprecedented ways.

It is important to appreciate that GIS does not always provide exact answers to problems, but
by identifying trends based on geography, GIS can reveal patterns that can help us make
informed decisions. A GIS can improve decision-making; it cannot make decisions for us.

On the next few pages are some typical applications to complete the GIS picture. They show
how GIS is helping to improve everyday life.

4.6.2 Flood risk

Using 3-D height data and map data for river features it is possible to build a computer model
of changing water levels; this can be used for predicting flood patterns and identifying areas
in danger. By combining this model with address data, the likelihood of individual properties
being flooded can be assessed. This is not just of environmental concern but of great value
to insurance companies.

4.6.3 Emergency services

By using the GIS as a computerised map, controllers of police vehicles and ambulances can
instantly call up a detailed map of the area around an incident. By tracking the vehicles in
real time and using route-finding GIS functions, the controller can identify the best vehicle to
attend and give directions for the fastest way to the incident. They can even store historical
information and look for incident patterns and black spots.

4.6.4 Estate agents


A GIS makes an excellent system for providing information to potential house buyers about
the houses for sale in a particular area. By allowing selection based on price, number of
rooms, type of house and so on the display can instantly show the range of properties fitting
the requirements of the customer (similar to when we selected restaurants in Chapter 3,
section 3.4). The system can then go on to provide information about the local amenities
such as schools, shops and recreation facilities. Several of these systems are already
available on the Internet; examples are below:
• http://www.propertybroker.co.uk
• http://www.national-property-register.co.uk

Go to search for a property:
• http://www.home-envirosearch.co.uk

4.6.5 Nature conservation

GIS can be used to record locations as part of a nature conservation project. The value of
the GIS is in providing instant access to information about the geographical spread of
sightings or plantings so patterns can be detected. It also allows for a user-friendly way for
individuals to input information into the system.

An example is below:
• http://www.mammalstrustuk.org

4.6.6 Retail

Supermarket chains use GIS to help site new stores and to plan their distribution networks.
By comparing how many people live within a 15-minute drive of a particular location with
the number of supermarkets already trading in that area, the GIS can identify suitable
locations with an optimised catchment area. Supermarket chains also use socio-economic
data to create profiles of the people in their catchment areas to help them understand which
other parts of the country are likely to be successful growth areas.

4.6.7 3-D environmental impact analysis

By building a 3-D model of a landscape it is possible to simulate the construction of a new
feature that may have an impact on the natural beauty of an area, for example, a planned
wind farm. By using accurate map data for the area, a realistic model can be created and
viewed from all angles. This helps to identify the location where the new wind farm will have
the least impact.

4.6.8 Airport-noise pollution

Restrictions on the permissible levels of aircraft noise affect all busy airports. GIS can help
monitor not only the noise itself but also complaints from nearby residents. The spread of
sound from the airport can be mapped against the nearby built-up areas to identify how many
houses are going to be affected by high noise levels. By logging the addresses of people
who complain about noise, the airport can monitor the effectiveness of its noise control
measures and whether or not the airlines are obeying guidelines.

5 Case studies
Chapter 5 refers to other parts of the web site rather than being text in the GIS files. See
http://www.ordnancesurvey.co.uk/business/studies/index.htm for more details.

6 Expert GIS concepts
6.1 Introduction
From the embryonic early days of computerised maps, GIS has now grown into a fully-
fledged science in its own right. Different aspects of geography and computing combine to
create a complex and dynamic subject area. GIS as a discipline can now be followed in
higher education and as a professional career. There is a wealth of literature written on the
subject, in journals, textbooks and on the web.

These pages do not attempt to provide a comprehensive, in-depth description of every
aspect of GIS. Instead, this chapter details some of the more interesting and current GIS
issues. Hopefully, this will emphasise just how far reaching the subject has become. You will
discover the importance of data formats and standards, the impact of database technology on
GIS evolution and how GIS has adapted to the Internet.

This chapter will also show how a diverse range of new types of map can be generated by
GIS software and, in the section on location-based services (LBS), why many predict an
explosion in the use of geographic data in everyday life through the medium of handheld
computers and mobile phones.

So read on to learn about a selection of more advanced GIS concepts.

6.2 Data formats

6.2.1 GI data compared to other data

In its most elemental form digital data is composed of bits: indicators that have a state of
either 0 or 1. Information can be encoded in these binary characters. The way in which this
code works varies between systems.

One of the most universal conventions is the organisation of bits into sets of eight, called
bytes. A byte therefore holds a sequence of 0s and 1s in any of 256 combinations, be it
00000000, 11111111 or 01101000. These are essentially the same as the set of numbers in
base 2 equating to the decimals 0 to 255. Streams of bytes can be used to encode all kinds
of information; the power of the computer comes from the volume in which these streams can
be stored, and the speed with which they can be transmitted and manipulated.
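
The arithmetic behind the 256 combinations can be checked directly. The short sketch below is for illustration only; the variable names are invented for the example.

```python
# Eight bits give 2**8 distinct patterns, i.e. the decimal values 0 to 255.
combinations = 2 ** 8

# Interpret the bit patterns quoted in the text as base-2 numbers.
values = [int(bits, 2) for bits in ("00000000", "11111111", "01101000")]

# Go the other way: format a decimal value as an eight-bit pattern.
pattern = format(104, "08b")

# A stream of bytes is just a sequence of such values.
stream = bytes(values)
```

The three patterns correspond to the decimal values 0, 255 and 104.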

The file is a collection of bytes that make up a logical unit of information. In describing types
of file, the terms ASCII and binary are commonly encountered. ASCII (American Standard
Code for Information Interchange) files adopt a convention in which each byte holds a code
(ASCII itself defines 128 of them, using seven of the eight bits) that corresponds to one of a
set of common characters. ASCII files are very simple and can be created using basic text
editing tools like Notepad. If a file is described as binary it means that information is encoded
in the bit sequence in some other way – only by knowing the code for that particular file can
you interpret the information, be it text, graphics, mathematical formulæ, video or whatever.
This is how different software products use different types of file, identified by their different
file extensions (for example, .doc, .tiff, .java and so on). In a sense, all files are binary, but in
common usage the term refers to a file that does not conform to the ASCII convention.
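
The difference between the two kinds of file can be sketched by encoding the same pair of coordinates both ways. The coordinates and the binary layout used here are assumptions made up for the example; they do not represent any real GIS format.

```python
import struct

easting, northing = 442000, 112000  # hypothetical coordinates

# ASCII encoding: each character becomes one byte that a text editor can show.
as_text = f"{easting},{northing}\n".encode("ascii")

# Binary encoding: two 32-bit little-endian integers packed into eight bytes.
# "<ii" is this sketch's own layout, not any real GIS format.
as_binary = struct.pack("<ii", easting, northing)

# Reading the text needs only the ASCII convention...
decoded_text = [int(part) for part in as_text.decode("ascii").split(",")]

# ...but reading the binary needs knowledge of the layout used to write it.
decoded_binary = list(struct.unpack("<ii", as_binary))
```

The binary form is more compact (8 bytes against 14), but it is unreadable unless you know that it holds two 32-bit integers in that byte order.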

The data used in GIS is no different. It is also organised into files, with different software
products using their own particular file types, binary coding and file extensions. The earlier
chapters of The GIS files described how geographical data comes in many different forms.
This fact is reflected in the files that are used to store this data. The range of different file
types used in GIS can be very confusing! Word processing packages have a simple use of
file types. With Microsoft® Word you store each document in a single .doc file. GIS can be
much more complex. With geometry, attributes, indexes, topology, image and history
information to store, most systems use more than one file (with multiple file types) to
encode a particular data layer. Section 6.2 attempts to put these file formats into
perspective.

6.2.2 Proprietary file types

Every software product is designed to work with a specific set of file types. In essence that is
what the software does: it reads the particular binary code to extract the stored information
and then does something with it, for example, displays it on screen, sends it to a printer or
performs calculations. Commercial software products usually have their own specific binary
formats.

It would be impossible to describe the full range of different file types used in GIS as there
are so many. However, it is important to recognise that each product handles the storage of
information in different ways and, to fully understand what is happening to the data, it is
useful to consider the files involved for your own system, what is going on under the bonnet,
so to speak.

Here are a few examples:
• single file for each layer – in some systems all information for a given layer is stored in a
single file (for example, .dxf files in AutoCAD®);
• multiple files for each layer – some systems use a series of files for each layer (for
example, MapInfo® has a .tab file for each table of information, but this is just a pointer to
a set of files containing the geometry, attributes, identifiers and indexes separately); and
• a folder of files for each layer – more complex systems can use a more complicated
hierarchy of files held in a specific folder to store the information (for example, ArcInfo®
coverages).

In these last two examples you only ever interact with the data through the GIS software
interface. The individual files are not designed to be edited outside the GIS as this will almost
certainly corrupt the data.

A GIS can read image data from standard graphics file formats but often needs an additional
file to register the image in space. Examples of such files are MapInfo® .tab and ESRI .tfw.
These are in fact simple ASCII files. If you have any examples on your own computer try
viewing the contents in a text editor; this can be useful to understand how they work. In the
next section we’ll see how ASCII files can be important in the transfer of data between
systems.
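
As an illustration of how such a registration file works, the sketch below reads the six numbers of a world file (the .tfw convention) and uses them to convert a pixel position into map coordinates. It assumes the usual world file layout of six lines stored in the order A, D, B, E, C, F; the sample values and function names are hypothetical.

```python
# Hypothetical world file contents: 2.5 m pixels, no rotation, and the map
# coordinates of the centre of the top-left pixel on the last two lines.
WORLD_FILE_TEXT = """\
2.5
0.0
0.0
-2.5
440001.25
114998.75
"""

def parse_world_file(text):
    """Read the six numbers of a world file in their stored order A, D, B, E, C, F."""
    a, d, b, e, c, f = (float(line) for line in text.split())
    return a, d, b, e, c, f

def pixel_to_map(params, col, row):
    """Convert a pixel column/row to the map coordinates of that pixel's centre."""
    a, d, b, e, c, f = params
    return a * col + b * row + c, d * col + e * row + f

params = parse_world_file(WORLD_FILE_TEXT)
top_left = pixel_to_map(params, 0, 0)
other = pixel_to_map(params, 100, 200)
```

The pixel height (E) is negative because image rows are counted downwards while northings increase upwards.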

6.2.3 Translators and transfer formats

GIS software is designed to work with data stored in specific proprietary binary data formats.
The skill of the software developer is to optimise the system’s performance to include as
many functions as possible while remaining fast and robust.

The binary code used to store the data is critical to the performance of the software and GIS
vendors guard their binary formats as part of their unique intellectual property.

This means, however, that many different file formats exist. With so many different
software products available, each with its own data formats, users of one system may
find it difficult to swap their information with users of another. In the early days of
GIS this was a serious problem: if you used data in one package you could not use the same
data in a separate system from a different vendor. More recently it has become standard for
GIS software to have import facilities that can open files from a diverse range of formats and
store them in the preferred local format.

There has also been an explosion in the development of translator software. Products have
been developed to convert geographical data between a whole array of formats. It would be
unfair in these pages to highlight a specific translator product in comparison to any other. But
it is now possible to convert between practically any of the possible data formats, of which
there are over a hundred, in either direction. Type GIS translators into a search engine and
see the results.

Most of these formats are binary, as in this form data is more closely integrated with the
software engineering of the products and can be manipulated more quickly. Code to work
with such data can only be written if you know the binary format. A series of simpler
ASCII file formats has been developed to enable easier transfer between
systems. The human-readable nature of ASCII files makes it far more straightforward for
other developers to write programs that can read these files. The
MapInfo MID/MIF format is an example of an ASCII transfer format.

Ordnance Survey has traditionally supplied its vector data in ASCII formats with a relatively
simple, documented structure that can easily be understood by developers. Until recently the
main formats used have been DXF and NTF. NTF is also British Standard BS7567, used for
the transfer of geographic data, administered by the British Standards Institution (BSI).

6.2.4 Open formats

The previous pages in this section have discussed why so many different file types are used
in today’s GIS applications. Traditionally, data providers have supplied data in open ASCII
formats with systems simultaneously loading and translating it into the proprietary binary
format. Data can be exchanged between systems where an import option exists for the
particular formats. There are also bespoke translation tools available to cover every possible
option. Exchange between formats has advanced further in recent years and the term
interoperability has become important. Within a single organisation there can be several
different software products being used and it is imperative that information can be shared
between them.

According to Moore’s Law, the processing power of computer hardware can be expected to
continue improving. It is therefore becoming less critical to the performance of GIS software
for the data to be stored in the optimum format for that particular system. The recent trend for
systems to use non-specific data formats means that data is read from, and written to,
different native formats on the fly as the software performs its functions.

As a data provider, Ordnance Survey has to make careful decisions about the formats it uses
to supply data. Data users want a choice of formats to avoid the need for translation, and
although it is difficult to provide every possible format, excluding just one would be unfair to
that software vendor. This explains the need for standard open data exchange formats that
create a level playing field for the producers of software and translators. The standards being
developed by the Open GIS Consortium are becoming a favoured option; see section 6.3.4 XML
and GML.

The increasing significance of databases and the Internet is also playing a big role in the
advance of interoperability. Increasingly, systems are being built around the use of
databases to hold the information in each GIS layer, replacing the use of flat files. Proprietary
binary formats are therefore becoming less important. A similar effect is seen in the way that
systems can now read data in real time from central locations on a network, rather than
reading from locally stored files. Although this section on data formats set out to highlight
differences between file types, these issues will probably have a greater resonance for those
practising GIS in the late 1990s rather than today. There is more to come on spatial
databases and the web in forthcoming sections of Chapter 6.

6.3 Standards
Standards are a fundamental part of modern society; an organised way for ensuring best
practice, common design, safety and many other benefits across every field of industry and
science. A series of bodies exist to coordinate and promote the generation of standards.
These bodies now play a crucial role in geographic information (GI) science.

ISO, the International Organisation for Standardisation, is very active in the field of standards
for GI. The ISO Technical Committee – TC 211 – is in the process of producing standards for
many aspects of spatial data, including metadata; spatial referencing by coordinates; imagery
and gridded data; and data quality. For more details of ISO, and especially ISO/TC 211, see
the TC 211 official web site.

BSI, the British Standards Institution, is the oldest national standards-setting body in the
world. Its standards include BS 7567 (also known as NTF) and BS 7666 (Spatial data-sets for
geographical referencing); see section 6.3.2 Gazetteers. Ordnance Survey is involved with the
BSI’s technical committee – IST/36 – which is responsible for the UK participation in the area
of GI in international committees.

More recently, OGC, the Open GIS Consortium, has been established. OGC is an expanding
organisation dedicated to the creation of standards in the field of interoperable geospatial
systems. The membership includes most of the GIS and database vendors, several major
spatial data users and a few spatial data producers. OGC develops interface specifications
for geospatial data and systems, and is increasingly involved in prototyping services for
serving and accessing spatial data over the Internet. One of the key areas of activity is
encouraging the adoption of standard formats for geographic data exchange based on
eXtensible Markup Language (XML); see section 6.3.4 XML and GML. The OGC public web site
contains a wealth of information about these activities.

6.3.1 Metadata

Metadata is a word that frequently crops up when discussing GIS. It can be described as
data about data. It can be very useful if files of digital information include additional
information describing the contents of the main part of the file. This concept is inherent in
many commonly encountered file types. For example, many graphics file formats like .GIF or
.JPG contain a header component that does not specify the image itself but describes it, for
example, its dimensions or, in a .GIF file, the palette of colours present in the image.
Similarly, many web pages carry metadata contained in meta-tags at the top of the file.

Metadata is very important in GIS because many different datasets exist, and it is essential to
know certain things about each one, such as who created it and when, the types of feature it contains, the
geographical bounding area and the precision, accuracy and scale. Many standard GIS file
types have a header or separate metadata file as part of the data format, or supply the
metadata as a written report. A good analogy is to liken metadata to the nutritional
information displayed on food packaging.

There has been much effort over recent years in the GI community to create metadata
reference archives so that the full range of available datasets can be identified and made
accessible. By standardising the way in which metadata is stored it is possible to identify
resources that contain common types of information. Therefore, if you are interested in
forestry you can access a metadata gateway and find references to forestry information
stored across the globe.

Try these links:
• askGiraffe.org.uk (link is now http://www.gigateway.org.uk/)
• US Federal Geographic Data Committee

6.3.2 Gazetteers

At its simplest level a gazetteer is a dictionary of geographical names. Every record contains
a description of the location, providing the user with a simple means of identification and
reference. An electronic gazetteer works in exactly the same way. It is a file or database
listing every feature of a particular type (such as a building, a road or a pond) within a
defined area. The user can locate the position of the feature and query any additional
information attached to it.
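
At its core an electronic gazetteer is a lookup from a name to a location plus attributes. The sketch below shows that idea with a handful of hypothetical records; the field names and coordinates are invented for illustration and do not follow any particular gazetteer standard.

```python
# Hypothetical records: name, feature type and National Grid coordinates.
GAZETTEER = [
    {"name": "Mill Pond", "type": "pond", "easting": 441250, "northing": 113400},
    {"name": "High Street", "type": "road", "easting": 442010, "northing": 112150},
    {"name": "Town Hall", "type": "building", "easting": 442100, "northing": 112200},
]

# Index the records by lower-cased name so lookups ignore capitalisation.
BY_NAME = {record["name"].lower(): record for record in GAZETTEER}

def locate(name):
    """Return (easting, northing) for a named feature, or None if unlisted."""
    record = BY_NAME.get(name.lower())
    return (record["easting"], record["northing"]) if record else None

def features_of_type(feature_type):
    """List the names of every feature of one type."""
    return [r["name"] for r in GAZETTEER if r["type"] == feature_type]
```

Calling locate("town hall") returns the grid reference of that record, and features_of_type("road") lists the roads held in the gazetteer.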

6.3.3 Gazetteers in England and Wales

Recent developments in the UK mean that a definitive national gazetteer of certain key
geographical features is closer to becoming a reality. Reliable and consistent gazetteers are
vital for different parties to be able to refer to and locate a feature easily and without
ambiguity.

The first step towards this goal was the New Roads & Street Works Act (1991), which
required utilities and contractors to notify local authorities of road works. It also specified that
local authorities should jointly maintain a National Street Works Register.

More significantly, in 1993 the first draft of the British Standard BS7666, Spatial data-sets for
geographical referencing, was published.

BS7666 currently contains four parts:
1 Specification for a Street Gazetteer – an up-to-date list of all streets in an administrative
area. Every entry is allocated a Unique Street Reference Number or USRN.
2 Specification for a land and property gazetteer (LPG) – an up-to-date list of all land and
property units in an administrative area. Each record is called a Basic Land and Property
Unit (BLPU) and holds data relating to its provenance. A BLPU also holds a grid
reference locating its central point and a Unique Property Reference Number (UPRN).
3 Specification for addresses
4 Specification of a data-set for recording public rights of way

Useful links:
• The National Land Information Service
• The National Street Gazetteer
• The National Land & Property Gazetteer

6.3.4 XML and GML

In general, mark-up languages use tags to attach a rule for interpreting a piece of
content. In HTML the rules cover visual formatting and the association of hyperlinks
with text and images. For example, the <font> tag can be used to instruct a browser
application to display a piece of text in a certain style.

In a similar way, eXtensible Markup Language (XML) uses tags to give meaning and context
to the content of a set of information. In an XML document, tags could encode the fact that
Southampton is a city. But alternatively they could state that, in a particular document,
Southampton refers to a node on a network of shipping lines, or in another, a football club.
This is where the extensible bit comes in. In XML you can define your own set of tag types as
long as the tag set applicable to your document is defined in a separate schema. XML
provides for self-describing data. This makes it very useful as a standard format for
exchanging information because computer programs can read the structure of an XML
document without prior knowledge of its particular tag set. XML makes system development more flexible and is
rapidly becoming the standard for information interchange on the Internet. For an introduction
to XML, visit www.xml.com/pub/a/98/10/guide0.html
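
The Southampton example can be sketched with two tiny XML documents that use different, self-chosen tag sets. Both documents and their tag names are hypothetical; the point is only that one generic parser can read either structure and that the tags carry the meaning.

```python
import xml.etree.ElementTree as ET

# Two hypothetical documents using different, self-chosen tag sets.
PLACES = "<places><city>Southampton</city><city>Winchester</city></places>"
LEAGUE = "<league><club>Southampton</club></league>"

def describe(xml_text):
    """Return (tag, text) pairs for each child of the root element."""
    root = ET.fromstring(xml_text)
    return [(child.tag, child.text) for child in root]

places = describe(PLACES)
league = describe(LEAGUE)
```

The same function reads both documents. In the first, Southampton is tagged as a city; in the second, as a club.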

The emergence of XML has led to the creation of a wide range of mark-up languages
specific to particular subject matter such as maths, chemistry and medicine. There is also a
Geography Mark-up Language (GML) that allows spatial data to be stored and transferred
between systems over a network. It allows points, lines and polygons to be encoded along
with their attributes and the spatial reference system on which they are based, for example,
the National Grid. There is already a lot of interest in GML from other communities, including
the mobile phone industry and the general Internet community. GML is fast becoming the
definitive method of describing geographical data and simple location information on the
Internet. The specification for GML 2.0 can be viewed on the OGC public web site.
Ordnance Survey has adopted GML as the format for OS MasterMap™.
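
To give a flavour of what GML looks like, the sketch below parses a small point feature written in the GML 2 style, where coordinates are held as comma-separated text inside a gml:coordinates element. The Feature and name elements and the srsName value are assumptions invented for the example; real products define their own schemas.

```python
import xml.etree.ElementTree as ET

GML_NS = "http://www.opengis.net/gml"

# Hypothetical feature: the element names outside the gml namespace and the
# srsName value are invented for this sketch.
FRAGMENT = f"""
<Feature xmlns:gml="{GML_NS}">
  <name>Town Hall</name>
  <gml:Point srsName="osgb:BNG">
    <gml:coordinates>442100,112200</gml:coordinates>
  </gml:Point>
</Feature>
""".strip()

def read_point(xml_text):
    """Return (name, spatial reference, (easting, northing)) for a point feature."""
    root = ET.fromstring(xml_text)
    point = root.find(f"{{{GML_NS}}}Point")
    coords = point.find(f"{{{GML_NS}}}coordinates").text
    easting, northing = (float(v) for v in coords.strip().split(","))
    return root.findtext("name"), point.get("srsName"), (easting, northing)

result = read_point(FRAGMENT)
```

The geometry, its attribute (the name) and the spatial reference system all travel together in one self-describing document.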

6.3.5 Future standards

Standards are continually being developed and enhanced, and GML is no exception. Many
leading software suppliers, data providers and GIS users are involved in coordinating the
way the GML standard is evolving. There are other OpenGIS® standards currently under
development that will also play an important role in the next few years.

OpenLS is an initiative for standards in the location-based services (LBS) arena (see
section 6.7). OpenLS is designed to make geospatial data and services more widely
accessible through PDAs and mobile phones. According to OGC, the vision of OpenLS is ‘to
deliver open interfaces that enable interoperability and make possible delivery of actionable,
multi-purpose, distributed, value-added location application services and content to a wide
variety of service points, wherever they might be, on any device’.

OGC has set standards for web mapping that are now being developed in the area of web
services (see section 6.6). The OWS-1 initiative takes the results of previous web mapping
testbeds and develops them further, with the aim of ‘developing interfaces to enhance the
growth of geospatial web services’.

Standards in the wider mobile phone market are also going to be important in the way LBS
evolves. Recent years have seen the phenomenal, and unexpected, success of the short
message service (SMS) for sending text via mobile phone. With the promise of much faster
transfer speeds from GPRS and 3G networks, combined with colour screen smartphones
with embedded digital cameras, the standard for exchange of multimedia messages (MMS)
will be important, especially as this will be the method through which GI will be visualised on
these devices.

Japan is a particularly useful example for demonstrating the importance of
standards. Standards are established very quickly in Japan through cooperation between the
standards bodies and industry players. This enables new technology industries to be
developed very quickly. The success of the i-mode system in Japan demonstrates this. Since
starting in 1999, this mobile phone system, with its colour screens and Internet access, has
grown so much that by March 2002 i-mode claimed 31.3 million subscribers (25% of the
population) and 53 000 compatible web sites. This is a stunning example of the benefits of
standards and, as you can see, this will be a crucial issue for GIS in the future.

6.4 Spatial databases

6.4.1 Database fundamentals

Section 6.2 Data formats describes different types of data file. Files are the most commonly
used packets of information in the world of the desktop computer. But when the volume of
data becomes very large, or you need to allow many people to access the data at the same
time, it becomes preferable to store the information in a database.

A database is a tool capable of storing large amounts of complex information in a structured
way. Information in a database is organised into individual records that can be referenced,
sorted, indexed, linked and queried. Most computer systems you interact with on a daily
basis have some kind of database behind the scenes. These contain many different types of
information, for example, an ATM showing your bank account details; on-screen flight
information at an airport; products and prices at the supermarket checkout and so on.
Databases are, in principle, more robust, secure and scalable than storing information in flat
files.

Database technology is a very large subject and can only be covered in very simplistic terms
in these pages. In large industrial software systems there will usually be multiple databases
operating together in a database management system (DBMS). There are two distinct major
database types: relational (RDBMS) and object-oriented (OODBMS). This section
summarises the difference between these database types and explains how database
technology plays a big part in GIS.

6.4.2 Relational databases

The storage of information in a relational database is fairly simple. The records of information
are organised into rows and columns in a table, with a separate row for each entity and a
column for each property stored about that set of entities. For example, a table could contain
a list of cars with each entry containing values such as a registration number, colour,
mileage, make or model. This is essentially the same as a spreadsheet. Each column is
formatted to store a particular type of data, be that text, numbers, dates or Boolean values.
These tables can be queried just like the earlier examples of selecting data in a GIS from
section 3.5 Using GIS? Be selective. Structured Query Language (SQL), which is used by
almost all RDBMS, was developed to allow interrogation of the data. For example, the
SQL statement SELECT * FROM CARS WHERE MAKE = 'Ford' will generate a list of all the Fords
in the original table.
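
The car table and its query can be tried out with the SQLite database that ships with Python. This is a minimal sketch: the column names follow the description above, while the rows are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # a throwaway in-memory database
conn.execute(
    "CREATE TABLE CARS (REGISTRATION TEXT, COLOUR TEXT, MILEAGE INTEGER, "
    "MAKE TEXT, MODEL TEXT)"
)
# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO CARS VALUES (?, ?, ?, ?, ?)",
    [
        ("AB51 CDE", "blue", 42000, "Ford", "Focus"),
        ("FG02 HIJ", "red", 18000, "Vauxhall", "Vectra"),
        ("KL52 MNO", "silver", 7500, "Ford", "Mondeo"),
    ],
)

# The query from the text, with an ORDER BY added so the result order is fixed.
fords = conn.execute(
    "SELECT * FROM CARS WHERE MAKE = 'Ford' ORDER BY REGISTRATION"
).fetchall()
```

The result holds the two Ford rows and leaves out the Vauxhall.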

For the computer to understand the data correctly there can be no duplicate records in a
table. In a relational database each record in a table must be uniquely identified for queries
to be meaningful. This usually means that one of the columns should contain unique values –
the primary key. In our example you could assume that car registration numbers will never be
duplicated so this column could be used as the primary key. Often the data does not
inherently contain a primary key, so the database will generate its own set of unique ID
numbers for that table.
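
Both situations, a natural primary key and a database-generated one, can be sketched with SQLite. The table and column names below are assumptions chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The registration number acts as the primary key, so duplicates are refused.
conn.execute("CREATE TABLE CARS (REGISTRATION TEXT PRIMARY KEY, MAKE TEXT)")
conn.execute("INSERT INTO CARS VALUES ('AB51 CDE', 'Ford')")
try:
    conn.execute("INSERT INTO CARS VALUES ('AB51 CDE', 'Vauxhall')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# When the data has no natural key, the database can number the rows itself.
conn.execute("CREATE TABLE SIGHTINGS (ID INTEGER PRIMARY KEY, SPECIES TEXT)")
conn.executemany(
    "INSERT INTO SIGHTINGS (SPECIES) VALUES (?)", [("otter",), ("otter",)]
)
ids = [row[0] for row in conn.execute("SELECT ID FROM SIGHTINGS ORDER BY ID")]
```

The second insert into CARS fails because the registration already exists, whereas the two identical otter sightings are kept apart by their generated ID numbers.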

The term relational derives from the fact that such databases use multiple tables to store the
information, with data linked together by relationships between these tables. These tables
contain information about like entities and make storage more efficient by avoiding
duplication. For example, in the car table there is no point in storing both the make and
model of each car. You know that every record having Vectra as the model will also have
Vauxhall as the make. You can store the relationship between model and make in a small
separate table and only record the model in the main table. This is a key concept of database
design and is known as normalisation. Database normalisation saves storage space and
makes the data easier to index and analyse. Querying highly normalised relational databases
can become quite complex since a large number of tables may need to be linked together.
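
The make and model example can be sketched as two linked tables. The schema and rows below are illustrative assumptions; the query shows the extra join that normalisation requires when you want to select by make.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Small lookup table: each model is stored with its make exactly once.
    CREATE TABLE MODELS (MODEL TEXT PRIMARY KEY, MAKE TEXT);
    -- Main table: records only the model for each car.
    CREATE TABLE CARS (REGISTRATION TEXT PRIMARY KEY, MODEL TEXT);
    INSERT INTO MODELS VALUES ('Vectra', 'Vauxhall'), ('Focus', 'Ford');
    INSERT INTO CARS VALUES ('AB51 CDE', 'Focus'), ('FG02 HIJ', 'Vectra'),
                            ('KL52 MNO', 'Vectra');
""")

# Finding all Vauxhalls now means linking the two tables on the model name.
vauxhalls = conn.execute(
    "SELECT CARS.REGISTRATION FROM CARS "
    "JOIN MODELS ON CARS.MODEL = MODELS.MODEL "
    "WHERE MODELS.MAKE = 'Vauxhall' ORDER BY CARS.REGISTRATION"
).fetchall()
```

The word Vauxhall is stored once however many Vectras are registered, at the price of a slightly more involved query.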

Crucial to the efficient querying of a database is the way in which the tables are indexed. You
can create indexes on tables that enable the computer to sort through and answer a query
quickly. One way in which this can work is for a column containing textual data to generate
an index in which the values are sorted alphabetically. The index stores ranges of values so
that to respond to a query the software only has to search through a small subset of the
whole table to find the required records.
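
The principle of a sorted index can be sketched without a database at all. In this simplified illustration the index is a sorted list of (value, row number) pairs and a binary search finds the matching range; real database indexes use more elaborate structures, and the sample rows are hypothetical.

```python
import bisect

# Hypothetical table rows: (registration, make), in no particular order.
TABLE = [
    ("KL52 MNO", "Ford"),
    ("AB51 CDE", "Vauxhall"),
    ("PQ03 RST", "Audi"),
    ("FG02 HIJ", "Ford"),
]

# The index: (value, row number) pairs sorted alphabetically by value.
MAKE_INDEX = sorted((make, row) for row, (_, make) in enumerate(TABLE))
KEYS = [make for make, _ in MAKE_INDEX]

def rows_where_make(make):
    """Use binary search on the index to fetch only the matching rows."""
    lo = bisect.bisect_left(KEYS, make)
    hi = bisect.bisect_right(KEYS, make)
    return [TABLE[row] for _, row in MAKE_INDEX[lo:hi]]

fords = rows_where_make("Ford")
```

Because the index is sorted, the search homes in on the block of Ford entries without examining every row of the table.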

6.4.3 Object databases

Object-oriented databases have their origins in the realm of object-oriented programming
languages. This subject is notoriously difficult to explain in simple terms and it may be
advisable to seek more detailed resources and devote some time to fully grasp the concepts
of object orientation.

OODBMS organise information very differently from RDBMS. Rather than spreading the
information about an entity across a range of linked tables, it is stored together in discrete
lumps called objects. Each object is defined within a hierarchy of object classes so that it
inherits properties from a parent class. Additional attributes can be defined within the object.
Objects are said to exhibit encapsulation because they can self-describe their own particular
set of properties, and therefore the way in which they can be queried. Object-oriented
databases can make it easier to model real-world phenomena in a logical form.
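
Inheritance and encapsulation are easiest to see in an object-oriented programming language, which is where these ideas come from. The sketch below uses ordinary Python classes rather than an actual OODBMS, and the class and attribute names are invented for the example.

```python
class Feature:
    """Parent class: every feature has an identifier and a name."""

    def __init__(self, feature_id, name):
        self.feature_id = feature_id
        self.name = name

    def describe(self):
        return f"{self.name} (id {self.feature_id})"


class Road(Feature):
    """Child class: inherits id and name, adds its own attribute and behaviour."""

    def __init__(self, feature_id, name, length_m):
        super().__init__(feature_id, name)  # reuse the parent's properties
        self.length_m = length_m

    def describe(self):
        # The object knows how to describe itself, including its extra data.
        return f"{super().describe()}, {self.length_m} m long"


road = Road(101, "High Street", 850)
summary = road.describe()
```

Everything known about the road is held in one object, and the object itself determines how it can be described or queried.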

There is a particular problem for the GIS student learning about object databases: the use of
the term object. The situation gets clouded by the fact that geospatial data corresponds to
real-world objects. In GIS the shapes and locations of things are stored as coordinate
geometry. GIS data is often stored in a database, either storing the coordinates as numbers
or using special geometry data types. You will hear GIS practitioners refer to an object
database to describe any database that can store the geometry of topographic objects in its
tables. To a practitioner of pure computer science the distinction between relational and
object databases has nothing to do with geography. To make things worse, there is a hybrid
type of database called object-relational in which advanced data types can be stored in
relational tables that reproduce some of the advantages of the object model. In theory, you
can store object geometry in each of these database types: relational, object and
object-relational.

As you can imagine, it is very important when using this jargon that you know exactly what
you mean by an object! The term spatial database is better for specifying the storage of
geographical features.

6.4.4 GIS and databases

As previously mentioned, many implementations of GIS use a database rather than simple
files for the data storage. Geographical datasets can be extremely large and so the benefits
of database storage are as applicable to GIS as to any other kind of system. The elements of
data security, the ability to cope with large data volumes and the accessibility to multiple
users are equally important. The explosion in the availability of cheap, high-volume disk
space has fuelled the proliferation of large databases.

Many large organisations maintain vast enterprise-wide information systems that incorporate
different types of geographical objects. For example, a utility company will use GIS to store
information about its pipe networks, the location of its customers and the location of its
maintenance teams. This information needs to be continually updated, and it is much easier
to lock the record for a single feature, perform edits and then commit the update if the
features are stored in a database rather than in a file. The database also allows the various
departments to view the information in different ways.

All major GIS software vendors provide tools to enable database storage. Database
management can be a complicated and specialist task, so products are developed to provide
a user interface similar to that of a regular desktop GIS but which also handle the database
administration side. Such products are often referred to as middleware. Middleware is usually
designed to operate across a range of the most popular database products such as
MS-Access, SQL Server, Oracle®, Sybase®, Ingres® and IBM®. More recently, the database
software companies themselves have been producing extensions to the standard
functionality that allow for the storage of complex data types, for example, coordinate
geometry, raster images and terrain models. This is a telling sign of how GIS has become
recognised as a key component of information technology.

6.4.5 Advanced database technology

Databases are now used in sophisticated ways that go beyond the simple storage of
information and its retrieval by structured queries. Two techniques that are often encountered
in the study of GIS are data warehousing and data mining. Databases are designed as part
of a particular system, only storing the information needed to make that system work.
Organisations end up with many databases serving different functions, and often these
databases contain information that has value beyond the purpose of the individual systems for
which they were designed. The centralised gathering together of diverse sets of information
stored within an organisation is known as data warehousing.

Techniques have emerged that automatically scan the information held in a data warehouse
to identify possible relationships between data items. This is known as data mining, which
can reveal phenomena that may otherwise remain undetected. Terms you will hear often
associated with data mining are regression, classification and clustering. Mostly, data mining
concerns a statistical analysis of the contents of the data warehouse to identify
commonalities and patterns. For example, regression refers to the mathematical analysis of
numerical data to identify a formula that best fits the trends in the data. If successful, this
can enable the prediction of future results.
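To make the idea of regression concrete, here is a minimal sketch of a least-squares
straight-line fit in Python. The figures are made up purely for illustration and the function
name fit_line is our own; real data mining tools use far more elaborate methods.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up values that happen to lie exactly on y = 2x + 1.
years = [1, 2, 3, 4]
sales = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(years, sales)
predicted = slope * 5 + intercept  # extrapolate to year 5
print(slope, intercept, predicted)
```

Once the formula has been found, it can be evaluated for a value outside the original data,
which is the sense in which regression supports prediction.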

Another feature of databases that is very important to their application in GIS is indexing. The
way in which indexes speed up the response to queries has already been described. This
becomes very important when performing geographical search queries as it is possible to
generate spatial indexes that break down the space occupied by features in the table and
sort them into a hierarchy similar to the alphabetical sorting of text values. In response to a
request to find all objects that intersect a polygon, it can be quicker to find a subset near that
polygon first and then analyse each object of this subset more accurately to find those that
actually intersect.
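The two-stage search described above can be sketched in Python as follows. This is a
simplified illustration, assuming point features and a plain grid index with an arbitrary cell
size; commercial spatial databases use more sophisticated index structures, and all names
here are hypothetical.

```python
from collections import defaultdict

CELL = 100.0  # grid cell size in map units (an arbitrary choice)


def build_index(points):
    """Map each grid cell to the ids of the points that fall in it."""
    index = defaultdict(list)
    for pid, (x, y) in points.items():
        index[(int(x // CELL), int(y // CELL))].append(pid)
    return index


def point_in_polygon(x, y, polygon):
    """Exact test: ray casting, counting crossings of polygon edges."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def query(points, index, polygon):
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    # Stage 1: coarse filter, looking only at cells that overlap the
    # bounding box of the polygon.
    candidates = []
    for cx in range(int(min(xs) // CELL), int(max(xs) // CELL) + 1):
        for cy in range(int(min(ys) // CELL), int(max(ys) // CELL) + 1):
            candidates.extend(index.get((cx, cy), []))
    # Stage 2: exact geometric test on the small candidate set.
    hits = sorted(p for p in candidates if point_in_polygon(*points[p], polygon))
    return candidates, hits


points = {"a": (120, 130), "b": (180, 180), "c": (950, 950), "d": (110, 150)}
triangle = [(100, 100), (200, 100), (100, 200)]
index = build_index(points)
candidates, hits = query(points, index, triangle)
print(candidates, hits)
```

Point c is never examined in detail because its grid cell lies well away from the polygon;
point b passes the coarse filter but is rejected by the exact test.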

Indexing is important because spatial queries can be very complicated and time consuming.
If you are trying to select all features lying within a county boundary, you could be checking
many thousands of records against a shape with thousands of vertices, which is a very
laborious geometric calculation. Spatial databases that contain features in three dimensions
are starting to be developed – for example, to store a building as a 3-D volumetric object, not
just a planar polygon shape. The generation of spatial indexing for three-dimensional space
presents interesting challenges!

Finally in this section, a mention of another key challenge – the storage of data in the fourth
dimension. GIS databases designed to store information about real-world objects and how
they change over time are called spatio-temporal. To truly reflect real-world changes in data
form, a GIS needs to maintain historical records. The simplest way of achieving this is to
keep copies of the data at intervals to create a series of time slices. A better approach, but
one that is harder to achieve, is to archive each feature every time a change is made to it; this
means you can answer a query for any moment in time.
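The second approach, archiving each feature whenever it changes, might be sketched as
follows. The record layout (valid_from and valid_to years) and the function names are
assumptions made for this example, not a description of any particular product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FeatureVersion:
    feature_id: str
    description: str
    valid_from: int            # year the version came into effect
    valid_to: Optional[int]    # None means "still current"


history = []


def update_feature(feature_id, description, year):
    """Close off the current version, then append the new one."""
    for version in history:
        if version.feature_id == feature_id and version.valid_to is None:
            version.valid_to = year
    history.append(FeatureVersion(feature_id, description, year, None))


def state_at(feature_id, year):
    """Return the description that was valid in the given year, if any."""
    for version in history:
        if (version.feature_id == feature_id
                and version.valid_from <= year
                and (version.valid_to is None or year < version.valid_to)):
            return version.description
    return None


# Hypothetical feature and dates.
update_feature("plot-7", "open field", 1990)
update_feature("plot-7", "housing estate", 2001)
print(state_at("plot-7", 1995))
```

Because no version is ever overwritten, a query can be answered for any year, including
years before the feature existed, for which nothing is returned.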

6.5 Derived mapping

6.5.1 Generalisation

Map generalisation is the process of reducing the scale and complexity of map detail whilst
maintaining the important elements and characteristics of the location. When creating a map
using traditional manual techniques, a cartographer aims to achieve a balance between the
amount of real-world information required to make the map useful and avoiding confusion for
the user. This is a time-consuming and expensive process.

GIS has led to the realisation that the efficiency of the cartographer could be increased
through the automation of some of the more time-consuming techniques such as line and
polygon simplification. Current off-the-shelf GIS software packages contain tools that allow
basic generalisation to be performed. An example of polygon generalisation is shown here.
Merging and simplification have been used to produce a cartographic representation of the
original data.

[Figure: generalised data alongside source data]

Probably the most famous line generalisation algorithm was developed by Douglas and
Peucker in 1973. The Douglas-Peucker algorithm reduces the number of vertices along a
digitised line, discarding those that lie within a chosen tolerance of the simplified line, to
create a representation suitable for the specified depiction scale.
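A compact recursive version of the Douglas-Peucker idea is sketched below in Python. The
sample line and tolerance are invented for illustration, and production GIS software will
generally use a more heavily optimised implementation.

```python
import math


def perpendicular_distance(point, start, end):
    """Distance from point to the straight line through start and end."""
    (x, y), (x1, y1), (x2, y2) = point, start, end
    dx, dy = x2 - x1, y2 - y1
    length = math.hypot(dx, dy)
    if length == 0.0:
        return math.hypot(x - x1, y - y1)
    return abs(dy * (x - x1) - dx * (y - y1)) / length


def douglas_peucker(points, tolerance):
    if len(points) < 3:
        return list(points)
    # Find the vertex farthest from the line joining the two end points.
    index, max_dist = 0, 0.0
    for i in range(1, len(points) - 1):
        d = perpendicular_distance(points[i], points[0], points[-1])
        if d > max_dist:
            index, max_dist = i, d
    # If every vertex is within tolerance, keep only the end points.
    if max_dist <= tolerance:
        return [points[0], points[-1]]
    # Otherwise keep the farthest vertex and simplify each half in turn.
    left = douglas_peucker(points[: index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    return left[:-1] + right


line = [(0, 0), (1, 0.1), (2, 0), (3, 3), (4, 0)]
print(douglas_peucker(line, 0.5))
```

With a tolerance of 0.5 the slight wobble at (1, 0.1) is discarded while the pronounced peak
at (3, 3) survives; a larger tolerance removes progressively more detail.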

Although these algorithms go some way to help in the automated production of smaller-scale
maps, generalisation technologies are very much in their infancy. The challenge of replacing
an experienced cartographer with a computer that can make the same decisions to produce
a map is significant.

The main problem that needs to be addressed in generalisation is how to resolve the conflict
between different map features when they are displayed at smaller scales. As there is not
enough space to display all of the information in an uncluttered manner, methods to typify the
data in an intelligent, consistent and coherent way at smaller scales need to be developed. It
is for this reason that the move from map generalisation as a manual art to a computerised
scientific process is a distant dream. For the foreseeable future the process will be a semi-
automated collaboration between cartographer and machine.

6.5.2 Text placement

It is difficult to place text on a map so that it is both legible and clearly associated with the
feature that it is annotating. Text placement refers to the complex challenge of achieving this
in an efficient manner to generate high-quality results. Text can of course be placed by
simple manual methods, although this is a time-consuming and inconsistent process. The
automatic generation and placement of text can result in savings in time and labour together
with a more repeatable result. Although seemingly simple in concept, this automatic process
is remarkably difficult to achieve in practice and is the subject of widespread research
interest.

The manner in which text is placed on a map depends largely upon the cartographic symbols
that are chosen to represent the points, lines and polygons of the source data and how they
relate to each other in a spatial context. One layer’s text or symbols may have a dramatic
impact in determining the placement of another layer’s text. It is therefore necessary to
assemble all the data layers required within the final map, then symbolise their features
according to the map’s extent and scale before text placement takes place.

[Figure: buildings and text; roads and text (in red)]

Modern automatic text placement software offers flexible placement options. There are
now choices of font styles, sizes and colours; preferred location of a piece of text;
weightings as to which text is more important and takes preference over other text (for
example, road text might be more important than building text); and minimum and
maximum allowable distances between different labels. A predetermined set of rules can
therefore be created applying to any source data for any location at any given scale. This
results in a map product that is generated more consistently and also more efficiently,
thereby greatly reducing the amount of manual effort required in its production. See
below for an example of automatically generated text.
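To give a flavour of how such a rule set might operate, the Python sketch below places labels
in order of weight, tries each label's preferred position first and rejects any position that
comes too close to a label already placed. It is a deliberately naive illustration; the label
names, boxes and weights are invented, and real text placement software is far more
elaborate.

```python
def overlaps(a, b, gap=0.0):
    """True if boxes (xmin, ymin, xmax, ymax) come within gap of each other."""
    return not (a[2] + gap <= b[0] or b[2] + gap <= a[0]
                or a[3] + gap <= b[1] or b[3] + gap <= a[1])


def place_labels(labels, gap=0.0):
    """labels: list of (text, weight, candidate_boxes), preferred box first."""
    placed = {}
    # Heavier (more important) labels are placed first.
    for text, _weight, candidates in sorted(labels, key=lambda item: -item[1]):
        for box in candidates:
            if not any(overlaps(box, other, gap) for other in placed.values()):
                placed[text] = box
                break
    return placed


labels = [
    ("Mill House", 1, [(0, 0, 4, 1), (0, 2, 4, 3)]),
    ("High Street", 5, [(2, 0, 8, 1)]),
    ("The Barn", 1, [(3, 0, 6, 1)]),
]
result = place_labels(labels)
print(result)
```

Here the road label takes precedence, one building label falls back to its second choice of
position and the other is dropped because it has no free position.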

6.5.3 Automated cartography

In Chapter 2 we showed how the appearance of vector data can be readily altered using
symbols and different line styles. In the two previous sections of this chapter we examined
the more advanced concepts of generalisation and text placement. These ideas, and more
besides, are all relevant to the automatic generation of products from source GIS data that
are meaningful and useful to us.

When we talk of automated cartography, what we are trying to achieve is a fundamental GIS
goal of capture once, use many times. In other words, it is inefficient to go through the
process of manually creating an aesthetically pleasing map every time something changes.
It’s far more desirable to automatically represent and display the source data as often as
required and in an infinite variety of ways. Modern GIS software can be used to rapidly and
efficiently generate highly complex maps from basic point, line and polygon features.

[Figure: source vector data alongside derived mapping]

Automated cartography can become a highly sophisticated business. In addition to
overcoming the problems of scale differences and placing text appropriately, we may also
wish to generate different kinds of map for different users. For instance, we might create a
map oriented in the direction in which someone is travelling, or one with colours that remain
distinguishable to a user with colour blindness.

Electronic data and electronic displays enable new forms of cartography to be
developed. For instance, standard data formats such as Virtual Reality Mark-up
Language (VRML) allow maps to become virtual three-dimensional worlds that you can
explore as if you were flying through the landscape. Furthermore, geographical data is
not necessarily best represented as a map. In many cases we are more interested in
direct information, such as navigation instructions, which might be delivered as text or as
synthesised voice. Where will this lead? Smelly and tasty GIS?

6.5.4 Data from imagery

Imagery – usually from aerial or satellite sensors – is widely used on GIS platforms as a
backdrop to vector mapping. An image may contain an abundance of visual information that
is not conveyed by the points, lines and polygons of a vector map. As far as GIS software is
concerned, however, an image is a dumb background. A key research challenge is to derive
vector objects from imagery.

An image is a raster dataset: it is a grid of squares or pixels. Each pixel has a numerical
value that may relate to colour, height or indeed virtually anything measurable.

[Figure: numerical values alongside their colour representation]

Human interpretation is often used to derive data from imagery. An operator traces lines over
the on-screen image in a technique known as heads-up digitising. This process remains very
labour intensive, however, and significant efforts are being made to find ways to automate it.

[Figure: aerial imagery alongside digitised buildings]

One approach is to look for abrupt changes or discontinuities in the image that will equate to
a line feature in a map. This can be achieved using an edge detection algorithm that applies
a mathematical function to each pixel and its immediate neighbours in turn. The result is an
image of lines that can simply be converted into vectors. These lines are often very messy,
however, and this method is best used where the discontinuity itself is distinct and separate.
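A very simple form of edge detection can be sketched in a few lines of Python. The version
below uses plain differences between neighbouring pixels, a simplified stand-in for the
operators found in image-processing software, and the tiny sample image is made up.

```python
def edge_strength(image):
    """Return a grid of gradient magnitudes for the interior pixels."""
    rows, cols = len(image), len(image[0])
    result = [[0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Differences between the neighbours on either side.
            gx = image[r][c + 1] - image[r][c - 1]
            gy = image[r + 1][c] - image[r - 1][c]
            result[r][c] = abs(gx) + abs(gy)
    return result


# Dark area on the left, bright area on the right.
image = [
    [10, 10, 10, 90, 90],
    [10, 10, 10, 90, 90],
    [10, 10, 10, 90, 90],
    [10, 10, 10, 90, 90],
]
edges = edge_strength(image)
for row in edges:
    print(row)
```

High values appear only where the brightness changes abruptly, marking the boundary
between the two areas; border pixels are left at zero in this sketch because they lack a full
set of neighbours.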

An alternative approach is to use software to look for similar clusters of pixels and thereby
classify the image into distinct areas. Where successful, this will identify real objects such as
buildings, fields and bodies of water within a classified image, which may then be converted
into a vector map. Accurately and appropriately deriving vector data in this way is a complex
activity at the forefront of research. The ability to automatically generate a map from an aerial
photograph or satellite image is a holy grail of GIS because it would help make data far
cheaper and more up to date.
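One small part of this process, grouping neighbouring pixels of the same class into distinct
areas, can be sketched as follows. The example assumes that each pixel has already been
assigned a class letter, which in practice is the difficult step; the grid and the class codes
are invented.

```python
def label_regions(grid):
    """Group 4-connected pixels with the same value into numbered regions."""
    rows, cols = len(grid), len(grid[0])
    labels = [[0] * cols for _ in range(rows)]
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                continue  # already part of a region
            next_label += 1
            labels[r][c] = next_label
            stack = [(r, c)]
            # Flood fill outwards from this pixel.
            while stack:
                y, x = stack.pop()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not labels[ny][nx]
                            and grid[ny][nx] == grid[y][x]):
                        labels[ny][nx] = next_label
                        stack.append((ny, nx))
    return labels, next_label


# 'W' = water, 'F' = field, 'B' = building
classified = [
    "WWFF",
    "WFFB",
    "FFBB",
]
labels, count = label_regions(classified)
print(count)
for row in labels:
    print(row)
```

Each numbered region could then be traced around its outline to produce a polygon for the
vector map.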

[Figure: imagery alongside derived area map]

6.6 Web GIS

6.6.1 Simple maps in web pages

The Internet and the world of computerised maps are tailor-made for each other. Maps are all
about the visualisation of information; the Internet is all about the accessibility of information.
The World Wide Web (WWW) is founded on the exchange of simple files carrying a mixture
of text and images. The form and content of these pages are encoded in HyperText Markup
Language (HTML), which can impart certain elements of behaviour. There are literally
millions of web pages out there containing map images. Although this on its own does not
really constitute GIS, there are some features of HTML that, when used in conjunction with
map-based image content, can replicate some simple GIS-like functions within a standard
web page.

HTML provides several different ways of presenting information and embedding links to
further related pages of information. The simplest form is the textual hyperlink. Images can
also be used as a hyperlink. The HTML image map tag enables the attachment of hyperlinks
to specific portions of an image. HTML image maps are very popular as they can work with
any image showing a set of objects, each with its own link to further information. These are
very commonly used with maps and give different results depending where you click on the
map (see example below).

[Figure: areas of the image map, with samples of the images that appear when the image map
is clicked on]

The use of image maps is fairly crude and they can be confusing with too many hotspots.
Another rather more precise way is to use an image input element on an HTML form. This
acts to submit the form, with the pixel location of the mouse click being passed to the next
page as a pair of variables (this requires some server-side scripting). If you know the grid
coordinates of the real-world extent of the map, and the dimensions of the image itself in
pixels, you can generate GIS-like events based on the click location. The example below
shows how this can work.
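The arithmetic behind this conversion is straightforward and can be sketched in Python as
follows. The image size and map extent are hypothetical values chosen to make the sums easy
to follow, and the function name is our own.

```python
def pixel_to_grid(px, py, width, height, min_e, min_n, max_e, max_n):
    """Convert a click at pixel (px, py) to map easting and northing.

    Pixel (0, 0) is the top-left corner of the image, so the y axis
    runs downwards and must be flipped to give a northing.
    """
    easting = min_e + (px / width) * (max_e - min_e)
    northing = max_n - (py / height) * (max_n - min_n)
    return easting, northing


# Hypothetical 400 x 300 pixel image covering a 4 km x 3 km extent.
e, n = pixel_to_grid(100, 150, 400, 300, 440000, 100000, 444000, 103000)
print(e, n)
```

A click a quarter of the way across and halfway down the image therefore corresponds to a
point 1 000 metres east of the left edge and 1 500 metres south of the top edge of the map.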

6.6.2 Internet mapping sites

The examples in the previous section show how some GIS-like functions can be recreated
using simple web authoring techniques. They are very useful because they can be built using
standard web technology and do not require any specialist mapping software. They have one
major drawback, however, as you are restricted to a single static map image; you cannot
change the view of the map. It is much more exciting to be able to generate a new map
dynamically based on choices made by the user.

There has been a remarkable explosion in the last few years of web sites offering the
creation of maps for anywhere in the world, to any level of detail, based on user-defined
parameters. The basic function of these sites is fairly standard: the user has a range of
options for selecting a location, including place name, postcode, full address and grid
location. There will be a zoom function and an ability to move around at a given scale in any
direction. The key thing is that the map image delivered to the page is generated dynamically
on the server; it is not a pre-prepared static image.

The range of Internet mapping sites is very diverse: some use purpose-built software, some
use off-the-shelf Internet GIS software, some use raster imagery, some generate custom
maps from vector data, and some provide printer-friendly versions. For some examples see
the list of Ordnance Survey Licensed Partners offering web mapping.

6.6.3 Internet GIS software

The Internet mapping sites described in the previous section primarily exist to generate
user-defined maps. A large amount of map data is stored on the web server, a request for a map
is received from a client terminal, a custom image is then generated by software on the
server and this image is delivered back to the client within a web page. The software may
differ but the end result is a map to look at or print.

The world of the GIS user is also being revolutionised by the Internet. All of the leading GIS
software vendors now have products that adopt the client-server architecture. This means
the software and data reside on a web server and multiple client applications access the
GIS processing functions across a network. This model is overtaking the use of multiple
desktop software installations, especially in large organisations. It can even work across the
WWW, and there are many sites that do more than just generate location maps. You can
have GIS tools on the web page accessing the full range of GIS functions on the server:
selection by attribute, spatial query, thematic mapping, data editing or 3-D visualisation. A
major advantage of this model is that the centralised map data only has to be stored and
maintained in a single location, meaning users are always viewing the most up-to-date
records.

With Internet GIS there is always a trade-off between the sophistication of user tools and
response times. Any system that uses the Internet is constrained by the download speed of
the connection. The smaller the amounts of data being used and the simpler the client user
interface, the faster the application. Internet GIS products differ in the way this balance is
approached. Some systems use a very simplistic user front end and display the results of the
server-side process by delivering a simple raster image. This means that the applications
tend to be fast and robust and will work within a standard browser. Other systems require a
client-side plug-in to be downloaded to give richer functionality to the user. Larger amounts of
data can be downloaded from the server to the plug-in, which can make the system work
more slowly; the benefit, however, is a more sophisticated set of user tools and greater
interaction with the data. The choice needs to be made based on the specific requirements of
the application and the expertise of the user community.

6.6.4 Web GIS futures

One of the ultimate goals of the GI industry is to have full interoperability between web-based
geographic datasets enabling information stored at different locations on the web to be
viewed together in single applications (see section 6.3 Standards). With many of the major
current GIS products, not only can you access web-based client-server versions of the
software but the standard desktop software can also load files stored centrally on the
Internet. So you could be looking at your locally held map files and then overlay a layer read
from a uniform resource locator (URL) on the WWW. The full vision of interoperability will
have web-based applications that can read all data files from any location on the Internet,
irrespective of data formats or the software being used.

As well as reading a data layer from a URL it is also possible to submit specific queries
(requests) and receive back one or more individual elements. This is known as feature
serving. For example, a GIS application could submit a query to a web server requesting
information about a district boundaries layer. The response could be a list of the district
names held in a particular table. If the client then submits a request for a particular district,
the boundary polygon geometry can be returned in response and the individual feature is
served to the client application for display, analysis or download. The OGC is very active in
establishing standards for map and feature serving on the web (see section 6.3 Standards).
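The request and response pattern of feature serving can be mimicked with a toy example in
Python. The request format, district names and geometry below are entirely hypothetical
and do not follow any OGC specification; the point is only to show a list request followed by
a request for one feature.

```python
# Hypothetical layer held on the server: district name -> boundary polygon.
DISTRICTS = {
    "District A": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "District B": [(10, 0), (25, 0), (25, 10), (10, 10)],
}


def handle_request(request):
    """Answer one client request, returned as a plain dictionary."""
    if request.get("action") == "list":
        return {"districts": sorted(DISTRICTS)}
    if request.get("action") == "get":
        name = request.get("name")
        if name in DISTRICTS:
            return {"name": name, "geometry": DISTRICTS[name]}
        return {"error": "unknown district"}
    return {"error": "unknown action"}


# The client first asks what is available, then requests one feature.
listing = handle_request({"action": "list"})
feature = handle_request({"action": "get", "name": listing["districts"][0]})
print(listing)
print(feature)
```

Only the geometry of the requested district crosses the network, rather than the whole
boundaries layer.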

The model by which information is exchanged between systems across the Internet, with
small packets of data being returned in response to specific queries, is becoming more
pervasive in all areas of computing. Information providers can establish web services in
which a single data store is created and standard open programming interfaces are
published and made available to system developers.

This means that a system can be developed that can call in exactly the required pieces of
information at exactly the time they are needed, rather than having to maintain multiple
copies of the same data, which can soon become out of date and degraded. Fuelled by
increased bandwidth, better security and adoption of standards, the Internet has moved from
the periphery to become a fundamental component of real-time IT architectures. The world of
GIS is no exception to this trend and many GI web services are being developed and made
available, replacing the situation of organisations having to obtain large volumes of map data
to manage themselves.

6.7 Mobile GIS

6.7.1 Positioning

Everyone has experienced the feeling of being lost. Positioning is the process of gaining
information about our location. This can take many different forms and often it is enough to
know which town, street, house number or room. Our location might also be specified in
terms of latitude and longitude, or in metres north and east of a false origin, as in the case of
a national grid.

A Cartesian coordinate system such as that described in Chapter 2, section 2.4.3 is a grid, with one corner
being the arbitrary ‘false origin’ (0,0) and all positions on the grid measured as distances
north and east of it. British National Grid is one example.

For example, the position of Southampton in British National Grid (BNG) coordinates is
440 000 metres east of the false origin and 100 000 metres north.

Although coordinates are essential, most people are usually more interested in the house
and street at which to find someone. Coordinates are therefore often linked to a computerised map
showing information that can be interpreted by a user to allow functions such as route
planning and querying the user’s current location.

Many technologies are available for positioning, but perhaps the best known is the US Global
Positioning System (GPS) (See Chapter 2, section 2.3.2). GPS receivers provide location
information as a set of coordinates in latitude and longitude format.

24 GPS satellites orbit the Earth, providing constant position information to commercially
available receivers on the ground.

The receiver calculates its location from distances measured to satellites orbiting the Earth.
The receiver picks up a digital signal transmitted by each satellite and measures the time
taken for the signal to arrive. Since the signals travel at the speed of light, the receiver can
calculate how far away each satellite is. By calculating distances from at least four satellites
and knowing their exact positions in the sky, it can then work out its own position. This gives
the position of the receiver accurate to within about 100 metres anywhere on Earth; however,
signals from more satellites and various techniques can achieve accuracy below one centimetre.
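The distance calculation itself is simply speed multiplied by time, as the short Python
sketch below shows. The travel time used is an illustrative value rather than a real
measurement, and the function name is our own.

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def range_from_travel_time(seconds):
    """Distance to a satellite from the travel time of its signal."""
    return SPEED_OF_LIGHT * seconds


# A travel time of 0.07 s is an illustrative value, not a measurement.
distance_m = range_from_travel_time(0.07)

# A timing error of one microsecond shifts the range by about 300 m.
error_m = range_from_travel_time(1e-6)

print(round(distance_m / 1000), "km")
print(round(error_m), "m")
```

The second figure shows why timing matters so much: an error of a millionth of a second in
the measured travel time changes the calculated distance by roughly 300 metres.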

Originally GPS was intended to provide position information for the US military, but today
GPS has a multitude of civilian uses, from surveying to the provision of location-based
services (LBS), which are discussed in the next section.

6.7.2 Location-based services (LBS)

LBS are a relatively new concept in GIS. Although maps have been around for centuries, a
map that can automatically find your nearest Italian restaurant, plan the most interesting
route there, tell you what is currently on the menu, and allow you to place an order to be
ready on arrival is a major advance. This is the promise of LBS and the reason why it is such
a hot topic today.

Companies developing such services have had a hard time selling them to customers, and
this may be for many reasons. Two of these may be a lack of understanding of what LBS is
and the cost of the technology involved. The cost is coming down fast, however, and more
and more people are beginning to understand what LBS can do for them. Today LBS
applications are even available on some mobile phones, making them accessible to
everyone.

Future LBS applications might:

• Direct you to the nearest service station when your vehicle’s engine management system
detects a problem, warn the station that you are coming so they have the necessary parts
ready for when you arrive, order a taxi for you, and let your colleagues or family know you will
be late.

• Replace paper maps with electronic devices that not only allow you to see the area you are
in but pinpoint your position, allow you to select map features and display information about
them, plan routes, guide you with voice commands and recalculate routes automatically if
conditions change.

• Alert emergency services the moment a medical problem is detected, and direct them to your
home via the quickest possible route, taking into account traffic conditions and drive
restrictions. It could even alert them to what the problem is so that the appropriate treatment
can be given immediately on arrival. Meanwhile, your calendar appointments could be
automatically rearranged, leaving you worry free during your recovery.

The common theme running through all these suggested applications is the location of the
user. It is this single parameter that defines LBS as a useful technology for the world of
today.

6.7.3 Personal and vehicle navigation

LBS, as discussed in the previous section, rely on information about the environment around
the user, combined with information about location, to provide users with specific services.
One application of LBS is personal navigation, allowing you to find your way around without
getting lost. Some personal navigation systems are better than others. The most commonly
used navigation devices are a map and a compass; these, however, have their limitations.
Maps can be inconvenient when unfolded, only provide as much information as can be
reasonably fitted onto them and cannot be viewed at different scales. The compass only
provides information about direction and then only when the user knows how to use it
correctly.

Today technology has all but replaced both of these instruments. Mobile devices allow the
user to view maps on screen, and GPS has replaced the compass by providing position,
speed and height data as well as direction information. In-vehicle navigation is one area in
which these devices have found an immediate niche. Many car models on the market now sport
in-built satellite navigation systems, meaning that the car has a GPS receiver and navigation
screen on the dashboard. Some systems even speak to you in a friendly voice, telling you
when to make a turn, how far you have to go, letting you know that you have taken a wrong
turn, and welcoming you to your destination. Only a few years ago such technology was
expensive, not very effective and the stuff of research; today it is commonplace, relatively
cheap and increasingly usable.

In-vehicle satellite navigation systems usually consist of the hardware – comprising GPS,
screen and computer system to drive it – and software – comprising a set of maps and a
computer program usually obtained on CD-ROM. As new roads are built, the CD-ROM can
be updated to the latest version, or, if going abroad, the CD-ROM for the destination country
could be bought. Future systems might be able to use mobile phone technology to download
maps of the area the vehicle is in, in real time from a server, eliminating the need for
CD-ROMs altogether.

6.7.4 LBS for the mass market

As we have seen in the previous sections, LBS have many applications. Their appeal lies both
in the company environment, where such services can be used, for example, to route delivery
vehicles effectively, and in the consumer market, where they can help people find their way to
their nearest restaurant, ATM or movie theatre. Once the domain of companies who could
pay for it easily, LBS technology is now accessible to almost everyone, thanks to the mass
production of low-cost mobile devices.

Perhaps LBS will mainly appeal to the mass market because it solves everyday problems.
Since location is such a basic parameter in almost everything we do, a multitude of uses
could be found for LBS; many of these are still waiting to be discovered. There is a huge
potential for creative companies to come up with new and innovative services for everyone to
use on their LBS device, be it a Personal Digital Assistant (PDA) or mobile phone. LBS could
save you time and money – even your life.

6.7.5 Telematics

Telematics is the application of location sensing, digital information and wireless
communication to solve certain types of problems for vehicular applications. Issues such as
safety, mobility or convenience can often be tackled in this way.

For example, consider a congestion management system for vehicles within a city. Each
vehicle carries a GPS unit, determining the vehicle’s location, and a radio transmitter sending
this information to a central base. The city centre could be designated as a chargeable area,
so that drivers of vehicles entering the city centre automatically have a charge debited from
their account, to encourage them to take an alternative route or mode of transport. Moreover,
the information could be used to determine the locations within the city where there are the
most cars, their speeds, and consequently the most frequent traffic jams. This information
could then be used to ensure other drivers stay clear of the area.

Another application is the area of automatic vehicle navigation. Although still within the
realms of research, the technology required for this is available today. Automatic vehicle
navigation would allow vehicles to effectively drive themselves with the aid of sensors both in
vehicles and on the road. Vehicles could be made to drive a predetermined route to their
destinations without the need for a driver, guided only by a computer and a GPS receiver
transmitting the vehicle’s location to a central server. The server would manage all the
vehicles on the road, eliminating traffic jams and ensuring that all vehicles stay within speed
limits and get to their destinations on time, safely and in complete comfort.

While this sounds futuristic, other telematics applications are already in use. Roadside
assistance can be dispatched immediately a vehicle breaks down, with no need for the driver
to explain where they are. Emergency services can be dispatched to the scene of an
accident as soon as a vehicle’s airbag is released. Fleet management systems can display
the locations of all vehicles in a fleet through the use of GPS, so that an operator can
dispatch vehicles to locations quickly. In the same way, a stolen vehicle can be tracked, with
its location transmitted to the police or the owner’s mobile phone. All of this is done with the
help of positioning technology, mobile technology and GIS technology. This is telematics.

ArcInfo and ESRI are trademarks of Environmental Systems Research Institute, Inc.
AutoCAD is a registered trademark and DXF a trademark of Autodesk Incorporated. IBM is a
registered trademark of International Business Machines Corporation. Ingres is a registered
trademark of Computer Associates International, Inc. MapInfo is a registered trademark of
MapInfo Corporation. Microsoft is a registered trademark of Microsoft Corporation. OpenGIS
is a registered trademark of Open GIS Consortium, Inc. Oracle is a registered trademark of
Oracle Corporation. Sybase is a registered trademark of Sybase Inc.
