
Contents

Constants
Formulae Sheet
Stars
1: The Sun
1.1. Outline of content
1.2. Light
1.3. Spectra: evidence for hydrogen
1.4. Gravity vs Pressure
1.5. The central engine: nuclear fusion
1.6. The outer layers of the Sun
1.7. Solar Neutrinos
2: Star Birth and evolution
2.1. Outline of content
2.2. The Hertzsprung-Russell diagram of stellar evolution
2.3. Star Birth
2.4. The Main Sequence
2.5. Red Giants and Horizontal Branch Stars
2.6. Summary
3: Star Death
3.1. Outline of content
3.2. Pulsating Stars
3.3. Planetary Nebulae
3.4. White Dwarfs
3.5. Supernovae
3.6. Black Holes
3.7. Summary
Galaxies
4: The Milky Way and other galaxies
4.1. Outline of content
4.2. The distribution of the stars
4.3. Dust extinction and the true shape of the Milky Way
4.4. Island Universes
5: The cosmic distance scale
5.1. Outline of content
5.2. Distances inside the solar system
5.3. Parallaxes of the nearest stars and the parsec unit
5.4. Stars as standard candles
5.5. Nearby galaxies: Cepheid variables
5.6. Distant galaxies: the supernova method
5.7. Distant galaxies: the Tully-Fisher galaxy rotation method
5.8. The scale of the extragalactic universe
6: The contents of galaxies
6.1. Outline of content
6.2. Stars of different sizes
6.3. Dust
6.4. Local Mass Census
7: The structure and dynamics of galaxies
7.1. Outline of content
7.2. Galaxy Types and Structures
7.3. Quantitative Morphology
7.4. Disc rotation and masses
7.5. Random star motions and interactions in spheroids
7.6. Galaxy collisions
7.7. Galaxy components summary
8: Active Galaxies
8.1. Outline of content
8.2. Properties of Active Galactic Nuclei (AGN)
8.3. Central masses in AGN
8.4. Dark masses in normal galaxies
8.5. Power from black hole accretion
9: The geography and history of galaxies
9.1. Outline of content
9.2. Galaxy number counts
9.3. The galaxy luminosity function
9.4. Galaxy clustering
9.5. Galaxy formation (non-examinable)
Cosmology
10: The Expanding Universe
10.1. Outline of content
10.2. Olbers’ Paradox: why is the night sky dark?
10.3. Hubble’s Law: everything is moving away from us
10.4. Is the Universe finite or infinite? Does it have a centre and an edge?
11: The Big Bang and Inflation
11.1. Outline of content
11.2. The Big Bang
11.3. Critical Density
11.4. The cosmic scale factor and redshift
11.5. Open, Closed and Flat Universes
11.6. Skeptical Sam on the Big Bang model
11.7. Guth’s Inflation solution
12: The Cosmic Microwave Background
12.1. Outline of content
12.2. Matter and Radiation
12.3. Radiation at the era of Recombination
12.4. Cosmic Microwave Background
13: Nucleosynthesis
13.1. Outline of content
13.2. Chemical building blocks
13.3. Deuterium
13.4. Nucleosynthesis and the baryon density
14: The Dark Side of the Universe
14.1. Outline of content
14.2. Dark Matter
14.3. What is Dark Matter?
14.4. Supernova and the accelerating expansion of the Universe
14.5. What is Dark Energy
14.6. A concordant cosmology
14.7. The Future of the Universe
15: Summary of important things to remember
Constants
We provide this list of constants and conversion factors for easy reference. In the Introductory
Astrophysics exam, you will be given a constant sheet that contains any information from this
page that you might need. Note that these constants are given to two decimal places of
precision. Some constants are of course known much more accurately than that, but you rarely
need more precision.

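Planck’s constant: $h = 6.63 \times 10^{-34}\ \mathrm{J\,s}$
Coulomb’s constant: $k_c = 8.99 \times 10^{9}\ \mathrm{N\,m^{2}\,C^{-2}}$
Gravitational constant: $G = 6.67 \times 10^{-11}\ \mathrm{m^{3}\,kg^{-1}\,s^{-2}}$
Boltzmann’s constant: $k = 1.38 \times 10^{-23}\ \mathrm{J\,K^{-1}}$
Stefan-Boltzmann constant: $\sigma = 5.67 \times 10^{-8}\ \mathrm{J\,s^{-1}\,m^{-2}\,K^{-4}}$
Electron & Positron charge: $e = 1.60 \times 10^{-19}\ \mathrm{C}$
Electron & Positron mass: $m_e = 9.11 \times 10^{-31}\ \mathrm{kg}$
Proton & Neutron mass: $m_p = 1.67 \times 10^{-27}\ \mathrm{kg}$
Mass of Helium: $M(\mathrm{He}) = 6.64 \times 10^{-27}\ \mathrm{kg}$
eV to J conversion: $1\ \mathrm{eV} = 1.60 \times 10^{-19}\ \mathrm{J}$
Mass of the Sun: $M = 1.99 \times 10^{30}\ \mathrm{kg}$
Radius of the Sun: $R = 6.96 \times 10^{8}\ \mathrm{m}$
Luminosity of the Sun: $L = 3.83 \times 10^{26}\ \mathrm{W}$
Average distance between the Earth and Sun: $1\ \mathrm{AU} = 1.50 \times 10^{11}\ \mathrm{m}$
Distance conversion, parsec to metre: $1\ \mathrm{pc} = 3.086 \times 10^{16}\ \mathrm{m}$
Speed of light: $c = 3.00 \times 10^{8}\ \mathrm{m/s}$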

You will be given a constant sheet that contains the information from this page in any Introductory
Astrophysics exam.

Formulae Sheet
Maths
Calculus: Differentiation

• If $x = y$, then $\frac{dx}{dy} = 1$
• If $x = y^{2}$, then $\frac{dx}{dy} = 2y$
• If $x = y^{n}$, where $n$ is a constant, then $\frac{dx}{dy} = n y^{n-1}$
• If $x = uv$, where $u$ and $v$ are functions of $y$, then $\frac{dx}{dy} = u \frac{dv}{dy} + v \frac{du}{dy}$.
• If $x = e^{ay}$, where $a$ is a constant, then $\frac{dx}{dy} = a e^{ay}$.

Calculus: Integration (where n and c are constants)

\[ \int dx = x + c \]
\[ \int x\,dx = \frac{x^{2}}{2} + c \]
\[ \int x^{n}\,dx = \frac{x^{n+1}}{n+1} + c \]

Examples of manipulating powers of 10 and log rules

\[ (10^{20})^{2} = 10^{(20 \times 2)} = 10^{40} \]
\[ \frac{10^{-11}}{10^{20}} = 10^{(-11-20)} = 10^{-31} \]
\[ \log(a^{b}) = b \log a \]

Physics
Area of a circle
\[ A = \pi r^{2} \]

Volume of a sphere
\[ V = \frac{4\pi r^{3}}{3} \]

Density
\[ \rho = \frac{m}{V} \]
10 INTRODUCTORY ASTROPHYSICS

Small angle formula where θ is in radians
\[ r = D\theta \]

Newton’s Second Law
\[ F = ma \]

Speed, distance (D), time (zero acceleration)
\[ v = \frac{D}{t} \]

Speed, distance (s), time (non-zero acceleration a)
\[ s = ut + \frac{1}{2} a t^{2} \]

Gravity
\[ F_{\rm grav} = \frac{GMm}{R^{2}} \]

Coulomb Force
\[ F_c = k_c \frac{|q_1 q_2|}{r^{2}} \]

Pressure Force
\[ F_{\rm pressure} = PA \]

Centripetal Force of an object of mass m moving in a circle of radius r with speed v
\[ F = \frac{mv^{2}}{r} \]

Momentum of a particle of mass m moving at speed v
\[ p = mv \]

Kinetic Energy
\[ K = \frac{1}{2} m v^{2} \]

Gravitational Potential Energy
\[ U = -\frac{GMm}{R} \]

Electrostatic Potential Energy
\[ E_{\rm PE} = -k_c \frac{e^{2}}{r} \]

Wavelength and Frequency
\[ c = \lambda f \]

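Photon Energy
\[ E = hf = \frac{hc}{\lambda} \propto \frac{1}{R(t)} \]

Momentum of a photon of wavelength λ
\[ p = \frac{h}{\lambda} \]

Matter Energy
\[ E = mc^{2} \]

Ideal Gas Law
\[ P = nkT \]

Thermal energy, per particle
\[ E = \frac{3}{2} kT \]

Astronomy
Distances and Angles (distance D in parsecs, angle θ in arcseconds)
\[ D(\mathrm{pc}) = \frac{1}{\theta''} \]

Flux, Luminosity and Distance
\[ F = \frac{L}{4\pi D^{2}} \]

Blackbody radiation
\[ \lambda_{\rm max}(\mu\mathrm{m}) = \frac{2900}{T} \]
\[ L = 4\pi R^{2} \sigma T^{4} \]

Telescope size D and resolution θ
\[ \theta = \frac{1.22\lambda}{D} \]

Redshift and recession velocity
\[ \frac{\Delta\lambda}{\lambda} = \frac{V_r}{c} = z \]

Hubble’s law:
\[ V = H_0 D \]

The cosmic scale factor
\[ R(t) = \frac{r(t)}{r(t = \mathrm{today})} \]

The density parameter
\[ \Omega_0 = \frac{\rho}{\rho_{\rm crit}} \]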

The rotation velocity of stars a distance R from the centre of a galaxy is given by
\[ v = \sqrt{\frac{GM(<R)}{R}} \]
Stars
1: The Sun

1.1. Outline of content


• Light
• Spectra: evidence for hydrogen
• Gravity vs Pressure
• The central engine: nuclear fusion
• The outer layers of the Sun
• Solar Neutrinos

1.2. Light

It’s there rising when you wake up in the morning, and there setting at the end of the day. You
can even see it reflected off the Moon at night: sunlight. The simplest and first astronomical
observation you ever made is that the Sun emits light. In this section we ask the question: what
is light?

Figure 1. In nature there exists an electric force field around charged particles (left) and a magnetic
force field around magnets (right) and moving charged particles. Images from the BBC

In nature there exist a number of fundamental force fields such as gravity and electromagnetism.
The electromagnetic force field can be split into two parts: an electric field which surrounds
charged particles (see the left panel of Figure 1) and a magnetic field which surrounds magnets
(see the right panel of Figure 1) and moving charged particles. Just as the force of the Earth’s
gravitational field keeps us stuck to the ground, the force of an electromagnetic field will force
charged particles to move.

Figure 2. Light is an ‘electromagnetic wave’ that transports energy through space by varying the
electric (red) and magnetic (blue) field. Image from Georgia State University, Hyperphysics page

Light can be thought of in two ways. The first ‘classical’ description is that light is an electromagnetic
wave that transports energy by varying the local electric and magnetic field as it travels through
space (shown schematically in Figure 2). This description is analogous to a water wave rising
and falling as it moves through the ocean. With light, it is the strength of the electric and magnetic
field that is rising and falling, and indeed changing direction, as the light travels along. The wave
is described by its wavelength λ which measures the distance between two peaks or two troughs
in the wave. Light travels at a fixed speed $c = 3 \times 10^{8}$ m/s and oscillates at a frequency $f$ given
by

\[ f = \frac{c}{\lambda} . \tag{1.1} \]

The second ‘quantum-mechanical’ description is that light acts like a series of packages of
energy, called photons, which act like particles. The energy Eγ of each photon in the light-particle
description is related to the wavelength of the light in its classical description by

\[ E_\gamma = \frac{hc}{\lambda} \tag{1.2} \]

where $h = 6.63 \times 10^{-34}$ J s is Planck’s constant.

Which description is correct? Is light a particle or a wave? The correct answer to this question
comes from quantum theory: light can act as both a wave and a particle. Furthermore, quantum
physics tells us that particles of matter also have dual properties and can act as both a wave and
a particle.

As we’ll see regularly throughout this course, mass and energy are equivalent and we can
therefore also consider each photon of light to carry momentum[1] even though photons are
massless! If we re-arrange the famous $E = mc^{2}$ to get a ‘pseudo-mass’ for our photon, $m_\gamma = E_\gamma / c^{2}$,
then the photon, travelling at speed $c$, will have momentum $p = m_\gamma c = E_\gamma / c$. Combining this with
equation 1.2 we find that the photon’s momentum is given by
\[ p = \frac{h}{\lambda} \tag{1.3} \]
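
As a quick numerical check of equations 1.1 to 1.3, the short Python sketch below evaluates the frequency, energy and momentum of a single photon. The function name `photon_properties` and the example wavelength of 500 nm (green light) are illustrative assumptions rather than anything taken from the notes; the constants are the rounded values from the constants sheet.

```python
# Rounded constants from the constants sheet
H = 6.63e-34   # Planck's constant [J s]
C = 3.00e8     # speed of light [m/s]
EV = 1.60e-19  # one electron-volt [J]


def photon_properties(wavelength_m):
    """Return (frequency [Hz], energy [J], momentum [kg m/s]) for one photon."""
    frequency = C / wavelength_m      # equation 1.1
    energy = H * C / wavelength_m     # equation 1.2
    momentum = H / wavelength_m       # equation 1.3
    return frequency, energy, momentum


# Illustrative example (an assumption): green light at 500 nm
f, energy, p = photon_properties(500e-9)
print(f"f = {f:.2e} Hz, E = {energy:.2e} J ({energy / EV:.2f} eV), p = {p:.2e} kg m/s")
```

The same function can be reused for Exercise 1 once you have worked out the wavelength of each type of light.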

Figure 3 depicts the electromagnetic spectrum of light, from low energy radio waves through
to high energy gamma rays. The vast majority of information that astronomers can collect
arrives in the form of this electromagnetic radiation. Astronomers have developed a wide range
of technologies to detect light emitted across the full wavelength range[2]. In the
Stars section of the course we will focus almost entirely on the visible range of the spectrum
which spans a wavelength range from λ ∼ 400 nm (violet) to λ ∼ 800 nm (red).

Figure 3. The electromagnetic spectrum from radio waves to gamma rays with their characteristic
wavelengths ranging from $\lambda = 10^{-12}$ m for high energy gamma rays up to $\lambda = 10^{3}$ m for radio waves.
Some examples of the space-based technology that is used to detect each type of light are also
shown. Image from JWST/NASA

Exercise 1. Estimate the energy (in joules, J) of a photon emitted by your microwave oven, a
photon emitted by the Sun and a photon detected by your long-wave radio tuned to Radio 4 on
198 kHz long wave (LW).

Hint: First work out the wavelength of each type of light (in units of metres m) and then calculate
the photon energy using equation 1.2.

1.3. Spectra: evidence for hydrogen

A spectrum is a measurement of the intensity[3] of light I detected at each wavelength λ. Figure 4
shows three sketches of spectra in the visible range. Example A shows a peak in the intensity of
the light at around λ = 450 nm. We would see this as blue light.

[1] In the classical description, momentum = mass × velocity.
[2] If you would like to learn more about the technology behind astronomical discoveries, you can take our online
AstroTech course on Coursera: https://www.coursera.org/course/astrotech
[3] Intensity is a measurement of the power per unit area per unit frequency, i.e. W m$^{-2}$ Hz$^{-1}$.

[Figure 4 sketch: three panels labelled A, B and C, each plotting intensity I against wavelength λ (nm) over roughly 400 to 620 nm.]

Figure 4. Three sketches of spectra in the visible wavelength range showing the intensity I of the
light at each wavelength λ.

Exercise 2. What colour light would we observe from the spectra shown in Figure 4? Sketch
spectra for different colours. What would the spectrum of a black and a white object look like?

Measuring the spectrum of light emitted from astronomical objects can give us a treasure trove of
information about the different physical processes that are occurring in the Universe. Figure 5
shows the spectrum of the light measured from the Sun. We can see that the intensity of the light
peaks in the visible range of the spectrum and that there is also a series of abrupt downwards
spikes. These downwards spikes were first discovered in 1814 by Joseph Fraunhofer and are
therefore often named after him as Fraunhofer lines. To understand the profound significance
of these spikes in the spectrum we need to first build a model of an atom.

1.3.1. The atom

At the centre of each atom is a small nucleus made up of positively charged protons and neutral
neutrons. The nucleus contains almost all of the mass in an atom, but the size of the atom is governed by
its outer shell, which consists of negatively charged electrons in orbits around the nucleus. As
seen in Figure 1, there is an attractive electric force field between oppositely charged particles.
So why do the negatively charged electrons remain in orbits around the positive nucleus? Surely
they should simply spiral into the centre of the atom?

In 1913 Niels Bohr proposed a new model of the atom to address this puzzle. Bohr postulated
that there is only a discrete set of circular orbits in which an electron is stable. In 1924, Louis
de Broglie explained why those orbits are stable by treating the electron as a wave, and we will now
follow the argument of his groundbreaking PhD thesis to calculate the radius of those orbits for a
hydrogen atom.

Quantum physics tells us that a particle can act like a wave (and light can act like a particle,
see section 1.2). Let’s therefore consider the electron orbiting the nucleus as an electromagnetic
wave (it is a moving charged particle so will be a source of both the electric field and the magnetic
field). For the orbit to be stable, the circumference of the orbit must be an integer number times
the wavelength (this is easy to understand looking at the diagram in Figure 6). We can therefore
write
\[ 2\pi r = n\lambda_e \tag{1.4} \]
where r is the radius of the orbit, n is an integer and λe is the wavelength of the electron.
We can use equation 1.3 to relate the wavelength of the electron λe to its momentum, p = h/λe.

[Figure 5 plot: intensity against wavelength (µm).]
Figure 5. The spectrum of light from our Sun. We can see that the intensity of light peaks in the
visible range of the spectrum. There is also a series of abrupt downwards spikes which are called
Fraunhofer lines or absorption lines. Image adapted from Physics, CSBJSU.

Figure 6. The Bohr-de-Broglie interpretation of quantised spacings for circular orbits. If the number
of wavelengths λe in the circumference of the orbit of the electron is not equal to an integer n (left
diagram) then that orbit is not allowed. The only allowed circular orbits have an integral number
(like 5 in the right diagram) of wavelengths in the circumference. Image and caption from Shu
(1981)

As quantum theory says we can switch between a wave and a particle description of matter as
we choose, let’s now flip back to thinking about the electron as a particle. It’s orbiting the nucleus
at a speed $v_e$, and therefore has momentum $p = m_e v_e$, where $m_e = 9.11 \times 10^{-31}$ kg is the
electron mass.
Exercise 3. By re-arranging equation 1.4 and equation 1.3 and equating the two formulae for λe,
show that the electron’s orbital speed is given by
\[ v_e = \frac{hn}{2\pi m_e r} \tag{1.5} \]

The orbiting electron will experience an inward attractive force towards the proton called the
Coulomb Force. The Coulomb Force between two charges $q_1$ and $q_2$, separated by a distance
$r$, is given by
\[ F_c = k_c \frac{|q_1 q_2|}{r^{2}} \tag{1.6} \]
where $k_c$ is Coulomb’s constant, $k_c = 9 \times 10^{9}$ N m$^{2}$ C$^{-2}$. As the electron is orbiting it will experience
a centripetal force, which for any object of mass m moving in a circle of radius r with speed v is
given by
\[ F = \frac{mv^{2}}{r} . \tag{1.7} \]
As the centripetal force is supplied by the Coulomb Force, we can write
\[ \frac{m_e v_e^{2}}{r} = k_c \frac{e^{2}}{r^{2}} \tag{1.8} \]
where we have used the fact that the electron charge $q_1$ is equal in magnitude but opposite in sign to
the proton charge $q_2$, with $|q_1| = |q_2| = e = 1.6 \times 10^{-19}$ C, where C is the unit of charge called a coulomb.
Exercise 4. Show that by combining equations 1.5 and 1.8, the radii of the electron orbits
in a hydrogen atom are given by
\[ r = \frac{h^{2} n^{2}}{4\pi^{2} k_c m_e e^{2}} \tag{1.9} \]
Exercise 5. Calculate the radius of the electron’s orbit in a Hydrogen atom when n = 1. This is
known as the Bohr radius of an atom and it tells us the smallest possible size for our Hydrogen
atom!

Clever Student Question No 1: “That’s great, we’ve taken the basic rules of quantum physics
and then used some simple physics and maths to calculate the radius of an atom. Amazing. But
aren’t you forgetting something really blindingly obvious here? Atoms are not circles, they are
spheres!”

This is indeed one of the biggest shortcomings of the Bohr-de-Broglie model, but this model does
explain the basic properties of the spectra that we observe from the Sun. Do take a quantum
physics course in future years, though, to derive the full 3D solution to the model atom.
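
Equations 1.5 and 1.9 can be checked numerically. The following is a minimal Python sketch, assuming the rounded constants from the constants sheet; the function names `orbit_radius` and `orbit_speed` are hypothetical labels chosen for illustration. It evaluates the radius and speed of the first few allowed orbits, so the n = 1 radius can be compared with the Bohr radius quoted later in section 1.4.3.

```python
import math

# Rounded constants from the constants sheet
H = 6.63e-34         # Planck's constant [J s]
K_C = 8.99e9         # Coulomb's constant [N m^2 C^-2]
M_E = 9.11e-31       # electron mass [kg]
E_CHARGE = 1.60e-19  # elementary charge [C]


def orbit_radius(n):
    """Radius [m] of the n-th allowed circular orbit, from equation 1.9."""
    return (H**2 * n**2) / (4 * math.pi**2 * K_C * M_E * E_CHARGE**2)


def orbit_speed(n):
    """Orbital speed [m/s] of the electron in the n-th orbit, from equation 1.5."""
    return H * n / (2 * math.pi * M_E * orbit_radius(n))


for n in (1, 2, 3):
    print(f"n = {n}: r = {orbit_radius(n):.2e} m, v = {orbit_speed(n):.2e} m/s")
```

Because the radius scales as n squared, the n = 2 orbit is four times larger than the ground-state orbit.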

1.3.2. Hydrogen: absorbing and emitting light

If you feed a small child large quantities of chocolate, it will start jumping around. Electrons
are pretty similar to small children. Feed them with energy, and they’ll start jumping around the
different orbits in the atom.

Let’s think about our electron orbiting the proton at the Bohr radius. This is called the ground
state and is the lowest, most stable orbit for the electron. The electron will have two types of
energy: kinetic energy
\[ E_{\rm KE} = \frac{1}{2} m_e v_e^{2} , \tag{1.10} \]
and electrostatic potential energy, which in this case comes from the Coulomb force and is given
by
\[ E_{\rm PE} = -k_c \frac{e^{2}}{r} . \tag{1.11} \]
Exercise 6. Adding the contributions of both the kinetic and potential energy, calculate the total
energy of the electron in the ground-state of the hydrogen atom.

Hint: You’ll need to use equations 1.8 and 1.9, and you should find $E_{n=1} = -13.6$ eV where
1 eV = $1.6 \times 10^{-19}$ J.

You can conclude from exercise 6 that if you feed a hydrogen atom with a photon of light with
at least 13.6 eV of energy, the electron in its ground state (n = 1) will be released from its bound
orbit and travel off into space. The hydrogen atom will then be ionised. Indeed there is
a series of discrete energies for photons that are particularly appetising for hydrogen atoms.
These photon energies correspond to the energy differences between different electron orbits.
The energy difference for an electron moving from orbit n to orbit m is given by the Rydberg
formula
\[ E(m) - E(n) = 13.6\ \mathrm{eV} \left( \frac{1}{n^{2}} - \frac{1}{m^{2}} \right) . \tag{1.12} \]
You should be able to derive this by extending your derivation in exercise 6. For any combination
of m and n you can therefore determine the energy of the photons that are absorbed (m > n)
by the hydrogen atom. When electrons move from higher orbits down to lower orbits, i.e. m < n,
photons are emitted (see Figure 7).
Exercise 7. Use the Rydberg formula in equation 1.12 to calculate the wavelength of the light
that is absorbed and emitted by the Hydrogen atom shown in Figure 7. Also calculate the light
absorbed with a transition from n = 2 to m = 3 (this is called the Hα line), and from n = 2 to
m = 6 (this is called the Hδ line). Can you identify any of these absorption features in the Sun’s
spectrum shown in Figure 5?
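
To show how equation 1.12 turns into a wavelength, here is a minimal Python sketch. The function names are hypothetical, the constants are the rounded values from the constants sheet, and the example transition (n = 2 to m = 4) is an illustrative choice that is deliberately not one of the cases asked for in Exercise 7.

```python
# Rounded constants from the constants sheet
H = 6.63e-34     # Planck's constant [J s]
C = 3.00e8       # speed of light [m/s]
EV = 1.60e-19    # one electron-volt [J]
E_ION_EV = 13.6  # ground-state binding energy of hydrogen [eV]


def transition_energy_ev(n, m):
    """E(m) - E(n) in eV from equation 1.12; positive when m > n (absorption)."""
    return E_ION_EV * (1.0 / n**2 - 1.0 / m**2)


def transition_wavelength_m(n, m):
    """Wavelength [m] of the photon absorbed or emitted between orbits n and m."""
    energy_j = abs(transition_energy_ev(n, m)) * EV
    return H * C / energy_j  # equation 1.2 rearranged for the wavelength


# Illustrative transition (an assumption): absorption from n = 2 to m = 4
lam = transition_wavelength_m(2, 4)
print(f"E = {transition_energy_ev(2, 4):.2f} eV, wavelength = {lam * 1e9:.0f} nm")
```

Swapping the arguments gives a negative energy difference, which corresponds to emission at the same wavelength.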

1.3.3. Summary

In this section we first looked at the spectrum of light from the Sun in Figure 5 and noticed
that it had sharp abrupt downwards spikes. We then developed a model of the Hydrogen atom
in section 1.3.1 and predicted that there will be a set of different photon wavelengths that will
be absorbed by neutral Hydrogen atoms. We then calculated what some of these wavelengths
would be, in exercise 7, and saw that they matched the downwards spikes in the Sun’s spectrum.
From this result we can draw two very important conclusions:

• There is Hydrogen in the Sun
• The Sun must have an outer envelope of neutral Hydrogen which absorbs light from a hot
core which is emitting it.

What is exciting is that we can draw these two conclusions without travelling to the Sun to
take a sample! In the next section we’ll ask the question: what is it like in the centre of the Sun?

[Figure 7 sketch: upper panel, photon absorption moving the electron from n = 1 to m = 2; lower panel, photon emission moving the electron from n = 2 to m = 1.]
Figure 7. Cartoon sketch of the Bohr model of a Hydrogen Atom, showing the effect on the electron
orbit when a photon is absorbed (upper), or emitted (lower). Image adapted from Physics, UTK.

1.4. Gravity vs Pressure

As we start to uncover the interior workings of stars and look at how stars evolve, we will come
back time and time again to the theme of gravity vs pressure. There is a continuous balancing
act occurring between these two forces and when one force wins over the other we will see a
change in the appearance of our star. In this section we will first look at gravity, then pressure
and then the finely balanced hydrostatic equilibrium between these two forces that exists for the
majority of a star’s lifetime.

1.4.1. Gravity

The gravitational force on a particle of mass m, from a particle of mass M, separated by a
distance R is given by
\[ F = \frac{GMm}{R^{2}} \tag{1.13} \]
where the gravitational constant $G = 6.67 \times 10^{-11}$ m$^{3}$ kg$^{-1}$ s$^{-2}$.

When we consider massive objects that are larger than a particle, such as the Sun, we can
use Newton’s shell theorem (the Newtonian counterpart of Birkhoff’s theorem in general relativity),
which shows that the gravitational field outside a spherical object of mass M is the same as that of
a point mass M located at the centre of the sphere.

In the first teaching studio problem solving workshop you will calculate that the Sun has a mass
$M = 2 \times 10^{30}$ kg and a radius of $R = 7 \times 10^{8}$ m.
Exercise 8. Calculate the gravitational force that a hydrogen atom will experience on the surface
of the Sun. If there was no force opposing gravity, calculate how long it would take for the Sun
to collapse. For simplicity take the acceleration experienced by each particle in the Sun to be
constant with a value $a = GM/R^{2}$.

Hint: $s = ut + \frac{1}{2} a t^{2}$, where s is the distance, u is the initial speed, t is the time and a is the acceleration.

If there was no force opposing gravity, you will find that the Sun would collapse in well under
an hour. Indeed it would collapse even faster than your calculation in exercise 8 finds because
the acceleration will increase as the Sun shrinks in size. Clearly something is preventing this
collapse, and that something is pressure.
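
The collapse-time estimate can be sketched in a few lines of Python. This is only an illustration of the constant-acceleration approximation described in Exercise 8, using the rounded solar mass and radius quoted above; the function name `constant_acceleration_collapse_time` is a hypothetical label.

```python
import math

G = 6.67e-11   # gravitational constant [m^3 kg^-1 s^-2]
M_SUN = 2e30   # rounded solar mass [kg]
R_SUN = 7e8    # rounded solar radius [m]


def constant_acceleration_collapse_time(mass, radius):
    """Time [s] to fall a distance `radius` from rest at constant a = G*mass/radius^2."""
    a = G * mass / radius**2
    # s = u t + (1/2) a t^2 with u = 0 and s = radius gives t = sqrt(2 s / a)
    return math.sqrt(2 * radius / a)


t = constant_acceleration_collapse_time(M_SUN, R_SUN)
print(f"collapse time = {t:.0f} s, or about {t / 60:.0f} minutes")
```

The result is an upper bound within this toy model, since the true acceleration grows as the Sun shrinks.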

1.4.2. Pressure and hydrostatic equilibrium

The Sun is in hydrostatic equilibrium such that the inwards gravitational force is perfectly balanced
by an outwards pressure force at every level as you move from the centre to the outer envelope
of the star. When the forces are balanced in this way no resultant force acts on any element of
the gas and the Sun remains mechanically steady (as observed).

We can use this information to calculate the central pressure in the core of the Sun. Let’s
consider a very small cylinder at radius r from the centre of the Sun, as shown in Figure 8.
The outwards net pressure force on the cylinder is given by Fpressure = ∆P ∆A where ∆P is the
difference in pressure between the top and the bottom of the cylinder. We’re going to
assume that the density of the star is constant throughout and is given by
\[ \rho = \frac{\text{Mass}}{\text{Volume}} = \frac{M}{\frac{4}{3}\pi R^{3}} , \tag{1.14} \]
such that the mass of the cylinder is ∆m = ρ∆r∆A, where ∆r and ∆A are the height and area of
the cylinder as shown in Figure 8.

In exercise 8 you calculated the gravitational force experienced by an atom on the surface of
the Sun. But what about the gravitational force experienced by our cylinder, which lies within the
Sun? Will the mass above it exert a gravitational pull in the opposite direction to the mass below
it?

Figure 9 shows a sketch of a small object (in pink) a distance r from the centre of the Sun. It’s
clear that it will experience a gravitational force from the mass contained within the yellow area.
But will it experience any gravitational pull from the mass in the outer white shell? Let’s consider
small regions of mass in this outer shell, as shown in blue, marked a and b.
Exercise 9. Show that for very small angles θ, the mass in region a is given by
$m_a = \pi \left[ \tfrac{1}{2}(R + r)\theta \right]^{2} \rho h$ and the mass in region b is given by
$m_b = \pi \left[ \tfrac{1}{2}(R - r)\theta \right]^{2} \rho h$. Show that the gravitational
forces from region a and region b are therefore equal in size and opposite in direction, such that they cancel
out.

Hint: For small angles, tan θ ≈ θ, and the area of a circle is Area = π × radius².

[Figure 8 sketch: a cylinder of cross-sectional area ∆A and height ∆r at radius r, with pressure P(r) on its lower face and P(r + ∆r) on its upper face. Labels: density ρ, volume ∆r∆A, mass ∆m = ρ∆r∆A, Fpressure = [P(r + ∆r) − P(r)]∆A, Fgravity = GM(< r)∆m/r².]
Figure 8. Sketch of a small cylinder a distance r from the centre of the Sun, with height ∆r and
surface area of the upper or lower circle of ∆A. The density of the cylinder is ρ, giving the mass
of the cylinder ∆m = ρ∆r∆A. The gravity and pressure forces on the cylinder are balanced in
hydrostatic equilibrium.

[Figure 9 sketch: regions a and b of thickness h in the outer shell, each subtending the small angle θ at the particle, at distances R + r and R − r respectively.]
Figure 9. Sketch to calculate the gravitational forces on a particle inside the Sun (shown pink). The
gravitational forces from blue areas a and b are equal and opposite. There is therefore zero net
gravitational force on the particle from the mass in the outer white shell.

There is zero net gravitational force on any point within the Sun from the mass in a
shell above it. This is a second result of Newton’s shell theorem (the Newtonian counterpart of
Birkhoff’s theorem). The force due to gravity on our cylinder is therefore given by
\[ F_{\rm gravity} = \frac{G M(<r)\,\Delta m}{r^{2}} \tag{1.15} \]
where M(< r) is the mass enclosed within a radius r, given by
\[ M(<r) = \rho V = \frac{M r^{3}}{R^{3}} \tag{1.16} \]
Equating the pressure force and gravitational force in hydrostatic equilibrium we have
\[ \Delta P\,\Delta A = -\frac{G M(<r)}{r^{2}}\,\rho\,\Delta r\,\Delta A , \tag{1.17} \]
where the minus sign indicates that the forces act in opposite directions. Re-arranging and
replacing M(< r) with equation 1.16, and ρ with equation 1.14 we find
\[ \frac{\Delta P}{\Delta r} = -\frac{3 G M^{2} r}{4\pi R^{6}} \tag{1.18} \]
If we now shrink our cylinder such that its height ∆r is infinitesimally small, we can write
$\frac{\Delta P}{\Delta r} \rightarrow \frac{dP}{dr}$ and integrate the above equation
\[ \int_{P_c}^{0} dP = -\frac{3 G M^{2}}{4\pi R^{6}} \int_{0}^{R} r\,dr \tag{1.19} \]

where we’ve included the limits that at r = R, P(R) = 0 and labelled the core pressure at the
centre, where r = 0, as P(r = 0) = Pc.
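
The step from equation 1.17 to equation 1.18 is quick but easy to get wrong, so it may help to see the substitution written out. The lines below are a sketch of that algebra under the same constant-density assumption, using only equations 1.14, 1.16 and 1.17.

```latex
\begin{align*}
\frac{\Delta P}{\Delta r}
  &= -\frac{G\,M(<r)\,\rho}{r^{2}}
     && \text{(equation 1.17 divided by } \Delta r\,\Delta A\text{)} \\
  &= -\frac{G}{r^{2}}
     \left(\frac{M r^{3}}{R^{3}}\right)
     \left(\frac{3M}{4\pi R^{3}}\right)
     && \text{(substituting equations 1.16 and 1.14)} \\
  &= -\frac{3 G M^{2}\, r}{4\pi R^{6}}
     && \text{(equation 1.18)}
\end{align*}
```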
Exercise 10. Integrate equation 1.19 to show that the central Pressure of the Sun is given by
\[ P_c = \frac{3 G M^{2}}{8\pi R^{4}} . \tag{1.20} \]
Hint: A reminder of some basic calculus: $\int dx = x + c$, $\int x\,dx = \frac{x^{2}}{2} + c$.

Let’s use this result to make an ‘order-of-magnitude’ estimate[4] for the central Pressure by rounding
the values of the constants (taking π ≈ 3)
\[ P_c \approx \frac{3 \times 7 \times 10^{-11} \times (2 \times 10^{30})^{2}}{8 \times 3 \times (7 \times 10^{8})^{4}} \approx \frac{84 \times 10^{(-11 + 30 \times 2)}}{24 \times 50^{2} \times 10^{(8 \times 4)}} \approx \frac{2 \times 10^{49}}{24 \times 50 \times 10^{32}} \approx \frac{10^{(49-32)}}{600} \approx 10^{14}\ \mathrm{N\,m^{-2}} \tag{1.21} \]
and let’s perform a dimensional analysis of the constants to make sure that everything is correct
and that we find units of Pressure as expected
\[ P_c\,[\text{units}] \rightarrow \frac{[\mathrm{m}]^{3} [\mathrm{kg}]^{-1} [\mathrm{s}]^{-2} [\mathrm{kg}]^{2}}{[\mathrm{m}]^{4}} = [\mathrm{kg}][\mathrm{m}]^{-1}[\mathrm{s}]^{-2} = [\mathrm{N}][\mathrm{m}]^{-2} \tag{1.22} \]
where 1 N = 1 kg m s$^{-2}$ (which, if you couldn’t remember, you could probably work out from
knowing F = ma).

[4] An ‘order-of-magnitude’ estimate is something that you can calculate in your head without having to turn on your
calculator.

In our analysis we made an approximation that the density ρ is constant throughout the Sun.
This is fine for getting a rough estimate of the central pressure, as we have done, but it’s actually
quite a poor approximation to make. The density at the core of the Sun is significantly higher
than the density in the outer envelope, and if this density variation is included in the analysis,
then the end result is about 100 times higher with $P_c = 2.5 \times 10^{16}$ N m$^{-2}$.
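
For comparison with the mental estimate, the uniform-density formula in equation 1.20 can also be evaluated directly. The Python sketch below assumes the rounded solar mass and radius used in the text; `central_pressure_uniform` is a hypothetical helper name, and the comparison value of 2.5 × 10^16 N m^-2 is the detailed-model figure quoted above.

```python
import math

G = 6.67e-11   # gravitational constant [m^3 kg^-1 s^-2]
M_SUN = 2e30   # rounded solar mass [kg]
R_SUN = 7e8    # rounded solar radius [m]
P_DETAILED = 2.5e16  # central pressure quoted for the variable-density model [N m^-2]


def central_pressure_uniform(mass, radius):
    """Central pressure [N m^-2] of a constant-density sphere, equation 1.20."""
    return 3 * G * mass**2 / (8 * math.pi * radius**4)


p_c = central_pressure_uniform(M_SUN, R_SUN)
print(f"uniform-density estimate: {p_c:.1e} N m^-2")
print(f"detailed model is larger by a factor of about {P_DETAILED / p_c:.0f}")
```

The two values differ by roughly two orders of magnitude, which illustrates how much the constant-density assumption underestimates the core pressure.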

1.4.3. Atoms under pressure

In section 1.3 we showed that the Sun has an outer envelope of neutral Hydrogen atoms. It is
reasonable to assume that if the outskirts of the Sun are made of Hydrogen, the interior will also
be made of Hydrogen. But can neutral Hydrogen atoms exist in the high pressure conditions of
the core of the Sun?

First let’s arrange our Hydrogen atoms in a grid of boxes in the tightest
configuration possible. Each electron (blue) is in its ground state, orbiting
the proton (orange) with a radius given by the Bohr radius calculated in
exercise 5, where $r_b = 5.3 \times 10^{-11}$ m.

[Sketch: a grid of boxes with sides labelled 2rb, one Hydrogen atom per box.]

Exercise 11. Use equation 1.6 to calculate the Coulomb force that binds together the proton and
electron in each neutral Hydrogen atom in our grid. If we placed our grid of atoms at the core of
the Sun, where the central pressure $P_c = 2.5 \times 10^{16}$ N m$^{-2}$, calculate the force due to pressure
that each atom would experience.

Hint: Consider the pressure on each side of the box grid and use the equation Force = Pressure
× Area.

The high pressure at the centre of the Sun pushes atoms together so hard that their electron
shells could not survive. We can therefore conclude that the centre of the Sun is composed of
bare nuclei and free electrons. This is a state of matter that we call plasma.
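
The comparison behind this conclusion can be sketched numerically. The Python lines below assume, as in the sketch of the grid, that each atom occupies a box of side 2rb so that the pressure acts on a face of area (2rb)²; that geometric reading and the variable names are assumptions made for illustration.

```python
K_C = 8.99e9         # Coulomb's constant [N m^2 C^-2]
E_CHARGE = 1.60e-19  # elementary charge [C]
R_BOHR = 5.3e-11     # Bohr radius from exercise 5 [m]
P_CENTRAL = 2.5e16   # central pressure of the Sun [N m^-2]

# Coulomb force binding the electron to the proton (equation 1.6 with |q1 q2| = e^2)
coulomb_force = K_C * E_CHARGE**2 / R_BOHR**2

# Pressure force on one face of the box (assumption: a square face of side 2 r_b)
box_face_area = (2 * R_BOHR) ** 2
pressure_force = P_CENTRAL * box_face_area

print(f"Coulomb force  = {coulomb_force:.1e} N")
print(f"Pressure force = {pressure_force:.1e} N")
print(f"ratio          = {pressure_force / coulomb_force:.0f}")
```

With these numbers the pressure force exceeds the binding force by a factor of a few thousand, consistent with the electron shells being crushed.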

1.4.4. Central Temperature

Without the need for a thermometer and a rocket trip to the core of the Sun, we are now going
to use some simple physics to determine the temperature at the centre of our Sun!

In the previous section we calculated the central pressure and concluded that this was sufficiently
high that there would be a plasma of nuclei and electrons in the core of the Sun. In the conditions
at the centre of the Sun, this plasma acts like an ideal gas. This means that it is in thermodynamic
equilibrium (all at the same temperature), that the nuclei and electrons are moving in random
directions and that the distance between the particles is much larger than the size of the particles
themselves[5]. The ideal gas law relates pressure P and temperature T and is given by
\[ P = nkT \tag{1.23} \]
where n is the number density of particles (i.e. the total number of particles divided by the volume,
or the mass density divided by the average mass of the particles), and $k = 1.38 \times 10^{-23}$ J K$^{-1}$ is
Boltzmann’s constant.
Exercise 12. Use the ideal gas law (equation 1.23) and your ‘order-of-magnitude’ estimate of
the central pressure in exercise 10 to calculate the temperature in the centre of the Sun. You
can estimate the number density of particles n by making the approximation that the density ρ
is constant throughout the Sun (see equation 1.14) and that the Sun is made entirely of equal
numbers of protons and electrons.

Hint: The number density n = ρ/m̄ where m̄ is the average mass of a particle.

Taking into account the fact that the Sun contains more than just Hydrogen, and modelling the
variation in the density from the core to the outer envelope, a more detailed calculation yields a
central temperature of $T = 1.57 \times 10^{7}$ K.
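
The chain of steps in Exercise 12 can be sketched as follows. This Python snippet is an illustration under the stated approximations (constant density, equal numbers of protons and electrons); it uses the uniform-density pressure from equation 1.20 rather than the rounded 10^14 N m^-2, and all variable names are hypothetical.

```python
import math

G = 6.67e-11    # gravitational constant [m^3 kg^-1 s^-2]
K_B = 1.38e-23  # Boltzmann's constant [J K^-1]
M_P = 1.67e-27  # proton mass [kg]
M_E = 9.11e-31  # electron mass [kg]
M_SUN = 2e30    # rounded solar mass [kg]
R_SUN = 7e8     # rounded solar radius [m]

# Constant mass density of the Sun (equation 1.14)
density = M_SUN / (4.0 / 3.0 * math.pi * R_SUN**3)

# Average particle mass for equal numbers of protons and electrons
mean_particle_mass = (M_P + M_E) / 2

# Number density n = rho / m_bar
number_density = density / mean_particle_mass

# Central pressure of a constant-density sphere (equation 1.20)
central_pressure = 3 * G * M_SUN**2 / (8 * math.pi * R_SUN**4)

# Ideal gas law (equation 1.23) rearranged for the temperature
temperature = central_pressure / (number_density * K_B)
print(f"estimated central temperature = {temperature:.1e} K")
```

The estimate comes out at a few million kelvin, within a factor of a few of the detailed value quoted above.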

1.4.5. Summary

In this section we concluded that the forces of gravity must be perfectly balanced by pressure
forces in hydrostatic equilibrium for the Sun to remain stable. We then calculated the central
pressure at the core of the Sun to conclude that in such high-pressure regions atoms must be
stripped of their electrons and a plasma formed. We then used the ideal gas law to calculate
the temperature of the plasma at the core of our Sun. From these results we can draw two very
important conclusions

• There is a very hot, high pressure plasma of protons and electrons in the core of our
Sun
• The Sun is structured with an outer envelope of neutral Hydrogen, and an inner plasma
core.

In the next section we’ll ask the question: what nuclear processes occur in the high temperature,
high pressure region in the core of the Sun?

1.5. The central engine: nuclear fusion

Inside the core of the Sun there is a “scalding inferno, with a hurly-burly of particles and photons
rushing about madly, smashing violently into each other” (Shu 1981). If we smash two protons
together, what will happen? We have already discussed the electrostatic Coulomb force in
section 1.3.1 which is a long-range force. This means that it acts over long distances with a
force (given in equation 1.6) that grows as $1/r^{2}$ as the particle separation r decreases. For
two protons of the same charge the Coulomb force is a repulsive force which will increase to ∞
as r → 0. From this we should conclude that two protons would never touch as the Coulomb
force would throw them apart as they get close to each other. But... (in a vain attempt for you
Star Wars fans to remember this)... “there is another”... and “the (nuclear) force is strong with
this one”.....

[5] As we look at the evolution of stars we’ll find that in old age, this particular criterion doesn’t hold any more such that
the plasma is no longer a perfect gas. For our young Sun however, this is a very good approximation to make.

Figure 10. Two colliding protons (in green) and a schematic of the energy levels required for them
to overcome the repulsive Coulomb force in order to become bound at small separations by the
attractive Strong Nuclear Force. The schematic plots energy E against separation r, with the
long-range electrostatic Coulomb force labelled as repulsive and the short-range Strong Nuclear
Force labelled as attractive; quantum tunneling allows some of the protons to break through the
Coulomb barrier.

Figure 10 shows a schematic of the energy that two colliding protons must have at large
separation in order to overcome the repulsive Coulomb force. As the
separation r decreases, the repulsive Coulomb force increases, but as r decreases still further,
eventually the attractive Strong Nuclear Force overcomes the repulsive Coulomb force and the
protons are attracted together.

The core of the Sun is incredibly hot and the protons have a very high energy, but the Coulomb
force is very strong. They therefore need a little quantum magic, in the form of quantum
tunnelling6 in order to break through the Coulomb barrier.
6
In section 1.2 we discussed that particles can act like light waves (and light can act like particles). If we think of the
protons as waves as they approach each other, then the energy of those waves will decrease exponentially as they

We have just described the first stage of a nuclear fusion reaction that powers our Sun.

1.5.1. Nuclear Physics: Notation

Before we proceed we need to learn some notation to describe the different particles at the core
of our Sun.

Each chemical element in the periodic table can be described by its

• Mass Number A: the number of protons plus neutrons


• Atomic Number Z: the number of protons

and these are written as superscripts and subscripts before the element’s symbol as follows
(1.24) {}^{A}_{Z}\mathrm{Element}

Exercise 13. How many protons and neutrons are there in Helium ^4_2He and Hydrogen ^1_1H?

It is the atomic number (the number of protons) that determines which element the particle
is. For example there are different isotopes of Helium: Helium-3 ^3_2He and Helium-4 ^4_2He. Both
particles are Helium, because they have Z = 2 protons, but Helium-3 has one fewer neutron
than the more stable Helium-4, giving it a mass number A = 3 instead of the usual
mass number A = 4.
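
The bookkeeping behind this notation can be sketched in a few lines. This is only an illustration: the `Nuclide` class and its field names are hypothetical, and the two isotopes are the ones described in the paragraph above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Nuclide:
    symbol: str
    mass_number: int    # A: protons plus neutrons
    atomic_number: int  # Z: protons

    @property
    def neutrons(self) -> int:
        """Number of neutrons, N = A - Z."""
        return self.mass_number - self.atomic_number

helium3 = Nuclide("He", 3, 2)
helium4 = Nuclide("He", 4, 2)
for nuclide in (helium3, helium4):
    print(f"{nuclide.symbol}-{nuclide.mass_number}: "
          f"{nuclide.atomic_number} protons, {nuclide.neutrons} neutrons")
```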

When we look at nuclear fusion processes, in addition to protons and neutrons, there will be
electrons e^-, positrons e^+, neutrinos ν and photons γ produced. Einstein’s famous Relativity
equation of mass-energy equivalence shows us where the photons come from:

(1.25) E = mc^2
In a fusion reaction the total mass of the fused particles at the end of the reaction is slightly less
than the mass of the original particles. The missing mass is converted into photons which carry
away energy from the reaction. This is the light that we see from the Sun.

1.5.2. The Proton-Proton Chain

Once the two colliding Hydrogen protons overcome the repulsive Coulomb force (as sketched
out in Figure 10), a nuclear fusion process starts. This is called the proton-proton chain, or the
p-p chain for short.

The p-p chain has three steps which can be written symbolically as follows:

(1.26) {}^{1}_{1}\mathrm{H} + {}^{1}_{1}\mathrm{H} \rightarrow e^{+} + \nu + {}^{2}_{1}\mathrm{H}

[Footnote 6, continued] travel through the Coulomb barrier. In this picture, there will remain a non-zero probability of the protons still existing
as they break through the other side of the Coulomb barrier and feel the short-range attractive strong nuclear force.

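(1.27) {}^{2}_{1}\mathrm{H} + {}^{1}_{1}\mathrm{H} \rightarrow {}^{3}_{2}\mathrm{He} + \gamma

(1.28) {}^{3}_{2}\mathrm{He} + {}^{3}_{2}\mathrm{He} \rightarrow {}^{4}_{2}\mathrm{He} + {}^{1}_{1}\mathrm{H} + {}^{1}_{1}\mathrm{H}

Putting the above three steps together gives

(1.29) 4\,{}^{1}_{1}\mathrm{H} \rightarrow {}^{4}_{2}\mathrm{He} + 2e^{+} + 2\nu + 2\gamma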


Exercise 14. Show that the three steps of the p-p chain conserve charge, i.e. that there is the
same amount of positive charge on each side of the reaction.
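
The check asked for in exercise 14 is simple bookkeeping, which can be sketched as below. The particle labels and the `totals` helper are hypothetical names chosen for this illustration; the three reactions are those of equations 1.26 to 1.28.

```python
# Each particle is (mass number A, electric charge in units of e).
PARTICLES = {
    "H1": (1, +1),    # proton
    "H2": (2, +1),    # deuterium nucleus
    "He3": (3, +2),
    "He4": (4, +2),
    "e+": (0, +1),    # positron
    "nu": (0, 0),     # neutrino
    "gamma": (0, 0),  # photon
}

PP_CHAIN = [
    (["H1", "H1"], ["e+", "nu", "H2"]),     # equation 1.26
    (["H2", "H1"], ["He3", "gamma"]),       # equation 1.27
    (["He3", "He3"], ["He4", "H1", "H1"]),  # equation 1.28
]

def totals(names):
    """Return (total mass number, total charge) for a list of particles."""
    return (sum(PARTICLES[n][0] for n in names),
            sum(PARTICLES[n][1] for n in names))

for reactants, products in PP_CHAIN:
    print(reactants, "->", products, totals(reactants), totals(products))
```

Each printed line shows the same pair of totals on both sides, so both the number of nucleons and the charge balance in every step.
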
Exercise 15. For each p-p chain reaction, use Einstein’s famous equation 1.25 to calculate how
much energy is produced. You will need to calculate the change in mass ∆m between the start
and the end of the fusion reaction.

Hint: The starting mass is 4M(H), where M(H) is the mass of the Hydrogen nucleus (a proton).
The end mass is M(He) + 2M(e^+) + 2M(ν). Neutrinos have very low mass so we can ignore
them.
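
As a rough numerical companion to the hint, the net mass difference of the whole chain (equation 1.29) can be evaluated as follows. This is a sketch under stated assumptions: the particle masses are standard rounded reference values rather than numbers from these notes, and the function name is hypothetical.

```python
# Particle masses in kg (standard reference values, rounded; an assumption).
M_PROTON = 1.67262e-27     # hydrogen nucleus
M_HELIUM4 = 6.64466e-27    # helium-4 nucleus
M_POSITRON = 9.109e-31     # same mass as the electron
C = 2.998e8                # speed of light, m s^-1
JOULES_PER_MEV = 1.602e-13

def pp_chain_energy():
    """Energy released when four protons fuse into one helium-4 nucleus."""
    start_mass = 4 * M_PROTON
    end_mass = M_HELIUM4 + 2 * M_POSITRON  # neutrino masses ignored
    delta_m = start_mass - end_mass
    return delta_m * C**2                  # E = m c^2, equation 1.25

energy = pp_chain_energy()
print(f"E = {energy:.2e} J = {energy / JOULES_PER_MEV:.1f} MeV")
```

The result is a few times 10^-12 J, the same order of magnitude as the energy used in equation 1.30.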

Using equation 1.2 to relate the energy of each photon released in the reaction to its wavelength,
you’ll find that the p-p chain produces very high energy gamma rays with wavelengths
(1.30) \lambda = \frac{hc}{E} \approx \frac{6 \times 10^{-34} \times 3 \times 10^{8}}{3 \times 10^{-12}} = 6 \times 10^{-14}\,\mathrm{m}
If these gamma rays travelled directly out of the Sun towards Earth it would only take
(1.31) t = \frac{D_{\mathrm{Earth-Sun}}}{c} \approx 8\ \mathrm{minutes}
to reach us. Whilst our atmosphere is very good at protecting us from harmful ultra-violet light, it
would not protect us from gamma-rays. In the next section we therefore ask what is happening
inside the Sun to convert the gamma-rays created by the nuclear fusion in the core of our Sun,
into the friendly optical light that we see?
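
The two estimates in equations 1.30 and 1.31 can be reproduced with a short sketch. The function names are hypothetical; the photon energy is the one used in equation 1.30 and the Earth-Sun distance is 1 AU = 1.5 × 10^11 m.

```python
H = 6.626e-34   # Planck's constant, J s
C = 2.998e8     # speed of light, m s^-1
AU = 1.5e11     # Earth-Sun distance, m

def photon_wavelength(energy_joules):
    """Wavelength of a photon of the given energy, lambda = h c / E."""
    return H * C / energy_joules

def light_travel_time(distance_m):
    """Time for light to cover a distance in a straight line, in seconds."""
    return distance_m / C

wavelength = photon_wavelength(3e-12)   # energy used in equation 1.30
minutes = light_travel_time(AU) / 60.0
print(f"wavelength ~ {wavelength:.1e} m, travel time ~ {minutes:.1f} min")
```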

1.5.3. From the Sun’s core to the outer envelope: Radiative Transfer

In the hot, dense, high pressure plasma in the core, a photon will find particles continually getting
in its way as it attempts to exit the star. Described as a ‘crazy dance’ or a ‘random walk’, as
Edinburgh University students, you might like to consider the poor photon akin to yourself as you
try to leave the crowded Teviot dance floor after one too many.

The photon (student) starts at the centre of the star (dance floor). It has equal probability of going
left or right. It can move a distance l before colliding with a particle (another student), at which
point it has another equal probability of going left or right, before colliding with another particle
(student). Now consider a group of photons at the centre of the star (a group of drunk students
at the centre of the dance floor). After they have all made N such moves, the average position of
the photons (students) will still be at the centre of the star (dance floor). Just as many photons
(students) have gone to the right as have gone to the left. But each individual photon (student)
will have made some erratic progress towards exiting the star (dance floor), just in completely
opposite directions.

Let x_N be the position of the photon (student) measured from the centre of the star (dance floor)
after N collisions. Between each collision they move a distance l. The average position over all
the photons (students) in the group after N collisions is ⟨x_N⟩ = 0. Let’s consider the square of the
position after N + 1 random steps. This gives us a measure of the displacement from the centre
of the star (dance floor) independent of the direction that the photon (student) has moved. For
each photon (student) the position after N + 1 collisions is
(1.32) x_{N+1} = x_N + l    [50% probability]
(1.33) x_{N+1} = x_N - l    [50% probability]
Taking the average of the squared displacement over our group of photons (students)
(1.34) \langle x_{N+1}^2 \rangle = \frac{1}{2} \langle (x_N + l)^2 \rangle + \frac{1}{2} \langle (x_N - l)^2 \rangle = \langle x_N^2 \rangle + l^2
The first move was a distance l, so we can write x_1^2 = l^2. We can then use equation 1.34 to see
that the second move gets our photons (students) to an average squared displacement from the
centre of the star (dance floor) of ⟨x_2^2⟩ = ⟨x_1^2⟩ + l^2 = 2l^2, the third move to ⟨x_3^2⟩ = ⟨x_2^2⟩ + l^2 = 3l^2, etc.,
such that we can write
(1.35) \langle x_N^2 \rangle = N l^2
We can now calculate how far our group of drunken students will need to dance in order to exit
the dance floor, covering a net distance of 10 m in random moves, each of size l = 0.5 m, before
colliding with another student. This would require, on average, N = 10^2/0.5^2 = 400 moves. This
means our poor drunken students will have to dance, on average, a total distance of N l = 200 m,
in random directions, before finally being able to exit the dance floor7.
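
Equation 1.35 can also be checked by brute force. The sketch below simulates many one-dimensional walkers with the dance-floor numbers above; the function name, the number of walkers and the random seed are arbitrary choices for this illustration.

```python
import random

def mean_square_displacement(n_steps, step_length, n_walkers, seed=0):
    """Average x^2 over many 1D walkers taking equal left/right steps."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_walkers):
        x = 0.0
        for _ in range(n_steps):
            # 50% chance of stepping right, 50% chance of stepping left
            x += step_length if rng.random() < 0.5 else -step_length
        total += x * x
    return total / n_walkers

n_steps, step_length = 400, 0.5
simulated = mean_square_displacement(n_steps, step_length, n_walkers=2000)
predicted = n_steps * step_length**2   # equation 1.35
print(f"simulated <x^2> = {simulated:.1f} m^2, predicted = {predicted:.1f} m^2")
```

The simulated value scatters around the prediction of 100 m^2, i.e. a root-mean-square distance of 10 m after 400 moves.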

It’s even worse for our photons. To cover the distance from the core of the Sun to the outer
envelope, a net distance R, the photons experience N = R^2/l^2 collisions. The distance
between collisions is called the mean free path and for the Sun is l ∼ 0.5 cm. You could
estimate this knowing a typical particle ‘cross-section’ (size), and calculating the number density
of particles in the Sun. In detail, however, this number varies significantly as you move from the
very dense core towards the outer envelope of the star.
Exercise 16. Calculate the number of collisions experienced by a photon as it travels from the
core to exit the Sun. Assume the distance between collisions is l ∼ 0.5cm. Calculate the time it
takes for a photon to exit the Sun.

Hint: Photons travel at the speed of light, even in a super hot dense plasma.
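
A minimal sketch of the estimate in exercise 16 is given below. The solar radius and the length of a year are standard values rather than numbers from these notes, and the function name is hypothetical.

```python
R_SUN = 7.0e8             # solar radius in m (standard value, an assumption)
C = 3.0e8                 # speed of light, m s^-1
SECONDS_PER_YEAR = 3.15e7

def escape_estimate(radius, mean_free_path):
    """Number of collisions and total travel time for a random walk."""
    n_collisions = (radius / mean_free_path) ** 2   # from equation 1.35
    path_length = n_collisions * mean_free_path     # total distance N l
    return n_collisions, path_length / C

n_collisions, seconds = escape_estimate(R_SUN, mean_free_path=0.005)
years = seconds / SECONDS_PER_YEAR
print(f"N ~ {n_collisions:.1e} collisions, t ~ {years:.1e} years")
```

With a single fixed mean free path the answer comes out at roughly 10^4 years, far longer than the 8 minutes of a direct flight.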

In its tortuous journey from core to surface, the photon loses energy each time it collides and
interacts with the matter. Indeed, it is only thanks to this ‘crazy dance’, more scientifically referred
to as radiative transfer, that the Earth is not irradiated by the gamma ray photons that are
produced by the p-p chain fusion reaction at the core of the Sun. By the time the photons are
released at the surface of the Sun, they have lost a sufficient amount of energy so that the
majority of them are emitted in harmless optical wavelengths.
7
Catherine and Andy challenge you all to test out this radiative transfer theory on the next crowded dance floor you
find yourself upon!

Clever Student Question No 2: “That’s great, we’ve looked at a drunken student colliding with
other students on a dance floor to understand how photons travel out of the Sun and discovered
that it takes them ∼ 10^5 years to travel through the Sun (where the extra factor of 10 comes from
the fact that l varies as the photon exits the Sun), even though they are travelling at the speed of
light, and that it is this very dance that saves us from gamma-ray-death. Amazing. But aren’t you
forgetting something really blindingly obvious here? The photons in the Sun and the students on
the dance floor move in more than just the one x dimension that we considered. The Sun is a
sphere so we need to consider how the photons move in all of the x, y and z directions.”

This is indeed a correct observation. In order to calculate this correctly you need to consider that
the photon undertakes a three-dimensional walk. If you factor in the extra dimensions, this just
gives us an extra factor of 3 in the front of equation 1.35.

1.5.4. Summary

In this section we discovered that the temperature at the core of the Sun was sufficient to start a
nuclear fusion process, called the p-p chain. This reaction fuses Hydrogen together to produce
Helium, light, positrons and neutrinos. We calculated the energy of the photons produced by this
reaction and realised that something must happen to these photons as they travel from the core
to the edge of the Sun, as if not, the Earth would be irradiated with gamma-rays. We concluded
that a photon’s energy would be reduced by the many collisions that it would experience on its
journey to exit the Sun.

In the next section we’ll ask the question, knowing that the Sun emits optical light, can we
calculate the temperature in the outer layers of the Sun?

1.6. The outer layers of the Sun

When any material has a temperature, collisions will occur between the particles in the material
and these collisions will cause some of the particles to accelerate. As you may remember from
section 1.2, a moving charged particle creates an electromagnetic field, and an oscillating
electromagnetic field is light! For a material at a single fixed temperature, there will be a
wide range of different energy collisions between the particles, and so the light
emitted will also span a wide range of energies. In 1900, Max Planck derived a relationship for
this thermal radiation that linked the temperature T of an idealised material in perfect thermal
equilibrium (called a blackbody) to the intensity of light emitted at each frequency f (see
equation 1.1 to link frequency and wavelength). This has become known as Planck’s Law8
of the blackbody spectrum.
(1.36) I(f, T) = \frac{2\pi h f^3}{c^2} \left[ \exp\left( \frac{hf}{kT} \right) - 1 \right]^{-1}    Planck’s law
where k is Boltzmann’s constant, h is Planck’s constant and I(f, T) is the intensity of light emitted
at each frequency f for a material at temperature T. Look back to section 1.3 on spectra to
remind yourself that intensity is the energy emitted per second, per unit area, per unit frequency
and has units of W m^-2 Hz^-1.
8
Do take a course in thermodynamics if you want to derive Planck’s Law. For clarity, you do not need to memorise
the Planck’s Law equation for this course!
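
To get a feel for equation 1.36, it can be evaluated at a few frequencies. This is a minimal sketch: the function name and the sample frequencies are arbitrary, and T = 5770 K is used only as a representative temperature.

```python
import math

H = 6.626e-34    # Planck's constant, J s
K_B = 1.381e-23  # Boltzmann's constant, J K^-1
C = 2.998e8      # speed of light, m s^-1

def planck_intensity(frequency, temperature):
    """Blackbody intensity I(f, T) of equation 1.36, in W m^-2 Hz^-1."""
    x = H * frequency / (K_B * temperature)
    # math.expm1(x) computes exp(x) - 1 accurately, even for small x
    return 2.0 * math.pi * H * frequency**3 / (C**2 * math.expm1(x))

for frequency in (1e13, 1e14, 3e14, 1e15):
    intensity = planck_intensity(frequency, 5770.0)
    print(f"f = {frequency:.0e} Hz: I = {intensity:.2e} W m^-2 Hz^-1")
```

The printed values rise, peak and then fall with increasing frequency, which is the characteristic blackbody shape.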

Astronomers use the term Luminosity L to describe the total amount of power that is emitted by
an astronomical object over all frequencies. To calculate this we first need to integrate Planck’s
Law, to sum up the intensity contribution at each frequency
(1.37) I(T) = \int_0^\infty I(f, T)\,df = \int_0^\infty \frac{2\pi h f^3}{c^2} \left[ \exp\left( \frac{hf}{kT} \right) - 1 \right]^{-1} df
To make our life easier, let’s substitute x = hf/(kT) and change df = (kT/h) dx so we can write the
above equation as
(1.38) I(T) = \frac{2\pi h}{c^2} \left( \frac{kT}{h} \right)^4 \int_0^\infty \frac{x^3}{e^x - 1}\,dx
and if we’re feeling lazy, we can just look up the integral on Wolfram to find
∫_0^∞ x^3/(e^x − 1) dx = π^4/15, which gives
(1.39) I(T) = \sigma T^4
where σ is the Stefan-Boltzmann constant (named after the people who first derived this), and
(1.40) \sigma = \frac{2 k^4 \pi^5}{15 c^2 h^3} = 5.67 \times 10^{-8}\,\mathrm{J\,s^{-1}\,m^{-2}\,K^{-4}}
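
If we’re feeling slightly less lazy, both the integral and the value of σ can be checked numerically. The sketch below uses a simple midpoint rule; the cut-off at x = 50 and the number of intervals are arbitrary choices, and the function name is hypothetical.

```python
import math

H = 6.62607e-34     # Planck's constant, J s
K_B = 1.380649e-23  # Boltzmann's constant, J K^-1
C = 2.99792e8       # speed of light, m s^-1

def bose_integral(upper=50.0, n_intervals=100_000):
    """Midpoint-rule estimate of the integral of x^3 / (e^x - 1) from 0."""
    width = upper / n_intervals
    total = 0.0
    for i in range(n_intervals):
        x = (i + 0.5) * width          # midpoint of the i-th interval
        total += x**3 / math.expm1(x)
    return total * width

integral = bose_integral()
sigma = 2.0 * math.pi**5 * K_B**4 / (15.0 * C**2 * H**3)   # equation 1.40
print(f"integral = {integral:.5f}, pi^4/15 = {math.pi**4 / 15:.5f}")
print(f"sigma = {sigma:.3e} W m^-2 K^-4")
```
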
Equation 1.39 gives us the energy emitted per second, per unit area. The Sun is a sphere
with stellar radius R and surface area 4πR^2. The Sun’s luminosity L is then given by the
intensity times the surface area, and hence is related to the surface temperature T_e of the Sun
through
(1.41) L = 4\pi R^2 \sigma T_e^4    Stefan-Boltzmann law

Exercise 17. The Luminosity of the Sun (the total amount of energy emitted per second) is
measured to be L = 4 × 10^26 W. Use the Stefan-Boltzmann law to calculate the surface
temperature of the Sun.
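
Inverting equation 1.41 for the temperature takes one line, as sketched below. The solar radius is a standard value and not a number from this section, and the function name is hypothetical.

```python
import math

SIGMA = 5.67e-8   # Stefan-Boltzmann constant, W m^-2 K^-4
R_SUN = 6.96e8    # solar radius in m (standard value, an assumption here)

def surface_temperature(luminosity, radius):
    """Invert L = 4 pi R^2 sigma T^4 (equation 1.41) for T."""
    return (luminosity / (4.0 * math.pi * radius**2 * SIGMA)) ** 0.25

t_surface = surface_temperature(4e26, R_SUN)
print(f"T_e ~ {t_surface:.0f} K")
```

Because the temperature depends on the fourth root of the luminosity, the answer is not very sensitive to the rounded value of L used here.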

1.6.1. The Colour of the Sun

In this section we’ll use Planck’s law to tell us the colour of the Sun. And before you all cry, “it is
yellow” and skip this section, let me warn you that your eyes are deceiving you......

Figure 11 shows the blackbody spectrum predicted by Planck’s law for objects at a range of
different temperatures. You’ll see that the spectrum has a characteristic shape; it rises at low
frequencies, peaks, and then drops again at high frequencies. The peak of the curve tells us the
frequency and wavelength with which the majority of the light is emitted, and hence the colour
that the object appears.

In order to derive a relationship for the peak wavelength of the blackbody spectrum as a function
of temperature, we need a reminder of some basic calculus. The maximum of a function x(y) is
found when the gradient of that function is zero, i.e. when dx/dy = 0. So the peak frequency of a
blackbody spectrum I will be found when dI/df = 0. It is tricky to differentiate I in equation 1.36, so
let’s make our lives easier by only considering cases when the frequency is large such that
(1.42) \exp\left( \frac{hf}{kT} \right) \gg 1
such that we can re-write equation 1.36 as
(1.43) I(f, T) \approx \frac{2\pi h f^3}{c^2} \exp\left( -\frac{hf}{kT} \right)    (‘Wien limit’ for large f)

Figure 11. Blackbody Spectrum predicted by Planck’s law, for a range of temperatures. The plot
shows log I(f, T) [W m^-2 Hz^-1] against frequency f = c/λ [Hz], spanning the radio, IR, UV, X-ray
and γ-ray bands. Image adapted from NJIT Radio Astronomy Lectures.

Exercise 18. Differentiate the intensity I in equation 1.43 with respect to frequency f and set
that equal to zero to show that the peak frequency of light emitted is given by f_max = 3kT/h.

Hint: A reminder of some calculus:

• If x = uv, where u and v are functions of y, then dx/dy = u dv/dy + v du/dy.
• If x = e^{ay}, where a is a constant, then dx/dy = a e^{ay}.
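
Once you have done the calculus by hand, the result can be sanity-checked numerically by scanning the Wien-limit spectrum for its maximum. This is only a sketch: the function names, the grid size and the temperature are arbitrary choices.

```python
import math

H = 6.626e-34    # Planck's constant, J s
K_B = 1.381e-23  # Boltzmann's constant, J K^-1
C = 2.998e8      # speed of light, m s^-1

def wien_limit_intensity(frequency, temperature):
    """High-frequency approximation to Planck's law (equation 1.43)."""
    x = H * frequency / (K_B * temperature)
    return 2.0 * math.pi * H * frequency**3 / C**2 * math.exp(-x)

def numerical_peak(temperature, n_points=200_000):
    """Scan a frequency grid and return the frequency of maximum intensity."""
    f_top = 10.0 * K_B * temperature / H   # grid covers hf/kT from 0 to 10
    grid = [f_top * (i + 1) / n_points for i in range(n_points)]
    return max(grid, key=lambda f: wien_limit_intensity(f, temperature))

temperature = 5770.0
f_numeric = numerical_peak(temperature)
f_formula = 3.0 * K_B * temperature / H
print(f"numerical peak = {f_numeric:.3e} Hz, 3kT/h = {f_formula:.3e} Hz")
```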

We can convert the peak frequency to a peak wavelength using equation 1.1 to find
(1.44) \lambda_{max} T = \frac{hc}{3k} \approx \frac{7 \times 10^{-34} \times 3 \times 10^{8}}{3 \times 10^{-23}} \, \frac{[J][s]\,[m]\,[K]}{[s]\,[J]} = 7 \times 10^{-3}\,\mathrm{m\,K}
where we’re using ‘order-of-magnitude’ values for the constants, and have performed a dimensional
analysis of the constants to make sure that it is correct (and indeed we find units of length and
temperature as expected).

If you do the full differentiation of Planck’s law, without the high frequency approximation that we
made, you’ll find that our estimate over-predicted by roughly a factor of 2. The exact relationship
is given by Wien’s law.

(1.45) \lambda_{max} T = 2.9 \times 10^{-3}\,\mathrm{m\,K}    Wien’s law


Exercise 19. Use Wien’s law to calculate the wavelength of the light that the Sun emits the most
of. You’ll need to use the surface temperature of the Sun that you calculated in exercise 17.
Use the colour scale in Figure 4 to then determine what colour the Sun is. Compare the peak
wavelength that you calculate to the data of the Sun’s spectrum shown in Figure 5. Do they
agree?
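
The arithmetic in exercise 19 can be sketched as follows, here assuming a surface temperature of 5770 K; substitute your own answer to exercise 17. The function name is hypothetical.

```python
WIEN_CONSTANT = 2.9e-3   # m K, from equation 1.45

def peak_wavelength(temperature):
    """Wavelength of peak blackbody emission from Wien's law, in metres."""
    return WIEN_CONSTANT / temperature

wavelength_nm = peak_wavelength(5770.0) * 1e9   # convert m to nm
print(f"peak wavelength ~ {wavelength_nm:.0f} nm")
```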

Clever Student Question No 3: “That’s great, we’ve used Planck’s Law and a measurement of
the energy emitted from the Sun to calculate the Sun’s surface temperature. And then we’ve used
the calculus that we learnt at high school to predict the colour of the Sun from that temperature.
Amazing. But something is clearly wrong here because this predicts that the Sun is green - and
it’s really blindingly obvious that it’s yellow!!!!”

This is indeed a correct observation that, to our eyes, the Sun appears to be yellow/white. Our
eyes are deceiving us though because they are not perfect detectors of the different colours in
the optical part of the spectrum. Here is a link to an excellent 3 minute video which explains the
biology of our eyes:

http://www.spitzer.caltech.edu/video-audio/150-ask2008-002-Why-Aren-t-There-Any-Green-Stars

1.7. Solar Neutrinos

In this final section before we leave the topic of our Sun and move on to look at stars in general,
we will look at one final piece of evidence to support the solar theory that we’ve outlined in this
chapter:

• The Sun has a core made up of a high pressure plasma of protons and electrons at a
temperature T = 1.57 × 10^7 K.
• The Sun has an outer envelope of neutral Hydrogen with a surface temperature of T =
5770 K.
• The temperature at the core of the Sun is sufficient to start a nuclear fusion process,
called the p-p chain, which fuses Hydrogen together to produce Helium.
• The energy produced in this reaction is emitted as gamma-rays which lose energy from
repeated collisions with particles inside the Sun as the photons leave the Sun, and are
finally emitted as thermal blackbody radiation mainly in the optical wavelengths.

One by-product of the p-p chain reaction in the core of the Sun, is neutrinos. These very low
mass particles exit the Sun and are detected on Earth. Detailed modelling of the core of the Sun
predicted how many neutrinos should be detected on Earth, but for a long time only a third of the
number predicted were detected. For a long time the particle physicists (who were detecting the
neutrinos) blamed the astronomers for having their solar theory wrong. Astronomers conversely
blamed the particle physicists for having faulty detectors. In the end, it was the Astronomers that
were right, but not because the neutrino detectors were faulty. Instead it was because of the
profound discovery that neutrinos have a multiple personality disorder. In their journey from the
Sun to Earth, they can change their appearance from electron neutrinos (as they are produced in
the p-p chain), to Muon and Tau neutrinos, which the particle physicists were not trying to detect.
This is why only a third of the neutrinos predicted were detected, because the other two-thirds
had changed into Muon neutrinos and Tau neutrinos in equal numbers. The confirmation of
solar theory via the detection of precisely the right number of neutrinos is a major achievement
for stellar astrophysics (and indeed ended up with a Nobel Prize).
Exercise 20. In the final exercise of this chapter, can you calculate how many solar neutrinos
are travelling through your thumbnail every second? Neutrinos are emitted at a rate of 2 × 10^38
neutrinos per second and they free-stream away from the Sun in all directions.

Hint: Consider a quick ‘order-of-magnitude’ estimate by approximating the surface area of the
sphere over which the neutrinos are distributed as 4πr^2 ≈ 10r^2, where r is the distance between
the Sun and your thumb on the Earth, which is 1 AU = 1.5 × 10^11 m.
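
A minimal sketch of this estimate is shown below. The thumbnail area of about 1 cm^2 is an assumption, and the function name is hypothetical.

```python
import math

NEUTRINO_RATE = 2e38   # neutrinos emitted per second (from exercise 20)
AU = 1.5e11            # Earth-Sun distance in m

def neutrinos_per_second(area_m2, distance=AU, rate=NEUTRINO_RATE):
    """Neutrinos crossing a small area facing the Sun each second."""
    flux = rate / (4.0 * math.pi * distance**2)   # per m^2 per second
    return flux * area_m2

# ASSUMPTION: a thumbnail is roughly 1 cm^2 = 1e-4 m^2.
count = neutrinos_per_second(1e-4)
print(f"~{count:.0e} neutrinos per second")
```
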
2: STAR BIRTH AND EVOLUTION

2.1. Outline of content


• The Hertzsprung-Russell Diagram of stellar evolution
• Star Birth
• The Main Sequence
• Red Giants and Horizontal Branch Stars

2.2. The Hertzsprung-Russell diagram of stellar evolution

If a zoologist wishes to learn about the full life cycle of an Amazonian poison dart frog, they only
need to observe a single frog for the entirety of its life, which is roughly 15 years. As Astronomers,
wishing to learn about the life cycle of a star, we sadly don’t have the luxury of being able to
observe a full life cycle. In our own lifetime we will see no difference in the appearance of our
Sun. We therefore need to use observations of millions of stars, each at different phases in their
lives, in order to deduce how our Sun was born, and what will happen to it in the distant future.

For each star we can use a telescope to measure two basic properties: its luminosity L (how
bright it is) and its colour λ_max, which we can convert to a surface temperature T_e using
Wien’s Law (equation 1.45). In 1910, Ejnar Hertzsprung and Henry Norris Russell made a
plot of these two quantities, L vs T_e, for a large sample of stars. They discovered that stars
did not have a random distribution of surface temperatures and luminosities, nor did stars of
a certain luminosity always have the same surface temperature. Instead they found that stars
tended to group together in different locations in the diagram, i.e. that there were a series
of different but preferred combinations of surface temperature and luminosity. Since this first
Hertzsprung-Russell diagram was made, the theory of stellar evolution has developed. We’ve
learnt that the locations of the different groups of stars on the diagram correspond to different
physical processes occurring inside the core and outer envelope as the star ages.

Figure 12 shows a Hertzsprung-Russell diagram of stars in our solar neighbourhood. Amateur
astronomers in the audience will recognise the names of some of the stars plotted, including
Betelgeuse which is the left shoulder of Orion (see figure 13) and Polaris, which is also known
as the ‘North Star’. The majority of stars lie along a band called the Main Sequence, and indeed
this is where we find our own Sun. Some other stars are found at brighter luminosities and redder
colours and these are called the Giants. We also see some low luminosity blue stars called White
Dwarfs. In this section we will look at the theory behind the different physical processes occurring
inside stars at different stages of their lives, providing evidence to support those theories using
the Hertzsprung-Russell diagram.

Figure 12. The Hertzsprung-Russell diagram of stars in our solar neighbourhood comparing the measured
Luminosity against the surface temperature or colour. Image taken from Earth Science Reference
Tables.

Exercise 21. Figure 13 shows the locations of Polaris and Betelgeuse. On the next clear night,
take a walk through Holyrood Park (or somewhere similarly dark) and try to locate these two
stars. Does Betelgeuse appear redder than Polaris? Take a census of the colours of all the stars
that you can see and think about where you would plot them on the HR diagram.
Figure 13. Polaris and Betelgeuse; two stars easily visible in the winter months in Edinburgh.

2.3. Star Birth

Star formation is a very active area of research, and we still do not have a full understanding of
how stars are born. In this section we will therefore use some approximations and simplifications
to gain some understanding of how stars are created from clouds of gas and then discuss the
computer simulations that give us more insight into the very earliest stages of life for our stars.

2.3.1. Gravity vs Pressure - again!

All stars start off life as a clump of gas with gravity causing the gas cloud to collapse. Let’s make
the assumption that our gas cloud is spherical, has constant density ρ and a mass M and radius
R.

We will first calculate the gravitational potential energy, also known as the binding energy of the
gas cloud. Consider a particle of mass m sitting on the edge of the gas cloud. What energy will
we need to move this particle from a radius r measured outwards from the centre of the cloud,
all the way out to infinity?

We can write down the gravitational force binding this particle at any separation r from the
centre of the cloud using equation 1.13. Remember Birkhoff’s theorem that the gravitational field
of a spherical object of mass M appears to act from a concentrated point mass M located at the
centre of the sphere (see section 1.4.1).

(2.1) F = \frac{G M m}{r^2}
From high-school physics you should remember that energy is ‘work done’ and that ‘work done
= force x distance’. The ‘work done’ or the gravitational potential energy lost when the particle is
removed from the cloud and sent out to infinity is therefore given by
(2.2) U = \int_\infty^R F\,dr = G M m \int_\infty^R \frac{1}{r^2}\,dr = -\frac{G M m}{R}
We’ve used calculus here because the gravitational force is not a constant, decreasing as we
get further away from the gas cloud.

Instead of removing particles one at a time, let’s now consider an outer shell in the cloud, of
thickness dr and mass dm. What energy will we need to move this shell from a radius r all the
way out to infinity? The mass of this shell dm is
(2.3) dm = \rho\,dV
where ρ is the density of the cloud and dV is the volume of the shell, given by the surface area of
the shell times the thickness; dV = 4πr^2 dr. From equation 2.2, the small amount of gravitational
potential energy to remove this small shell is then
(2.4) dU = -\frac{G M(<r)\,dm}{r} = -\frac{G}{r} \frac{4\pi r^3 \rho}{3} \, 4\pi r^2 \rho\,dr = -\frac{16\pi^2 G \rho^2}{3} r^4\,dr
Exercise 22. Consider removing a series of shells out to infinity, one by one, until there is nothing
left in the cloud to calculate the total binding energy of the system. You will need to sum up all
the energy contributions dU from each shell dr using calculus.

Hint: Don’t forget that we’re assuming that the density is constant so you can take that out of the
integral. And a basic calculus reminder: ∫_0^R r^4 dr = R^5/5.

You have just proven that the gravitational binding energy of a cloud of mass M and size R is
given by
(2.5) E_{grav} = -\frac{3}{5} \frac{G M^2}{R}
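
The shell-by-shell sum of exercise 22 can also be carried out numerically and compared with equation 2.5. This is a sketch only: the function name and the number of shells are arbitrary, and the mass and radius are illustrative solar-like values rather than numbers from this section.

```python
import math

G = 6.67e-11   # gravitational constant, N m^2 kg^-2

def binding_energy_by_shells(mass, radius, n_shells=100_000):
    """Sum dU = -G M(<r) dm / r over thin shells of a uniform sphere."""
    density = mass / (4.0 / 3.0 * math.pi * radius**3)
    dr = radius / n_shells
    total = 0.0
    for i in range(n_shells):
        r = (i + 0.5) * dr                                # shell mid-radius
        enclosed = 4.0 / 3.0 * math.pi * r**3 * density   # M(<r)
        dm = 4.0 * math.pi * r**2 * density * dr          # equation 2.3
        total -= G * enclosed * dm / r                    # equation 2.4
    return total

mass, radius = 2.0e30, 7.0e8   # illustrative solar-like values (assumption)
numeric = binding_energy_by_shells(mass, radius)
formula = -0.6 * G * mass**2 / radius   # equation 2.5
print(f"shell sum = {numeric:.4e} J, formula = {formula:.4e} J")
```
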
As discussed in section 1.4, there is a constant battle playing out between gravity and pressure
during our star’s lifetime, and that battle starts on day one. As our gas cloud collapses under
gravity, the gas pressure increases which causes the temperature of the cloud to increase (see
the ideal gas law, equation 1.23). As the gas cloud collapses the potential binding energy
increases, but so also does the kinetic energy of the particles in the gas.

Each particle at temperature T has a kinetic energy given by9

(2.6) E_{KE} = \frac{3kT}{2}

If our gas cloud has N particles of mass m, such that the total mass of the cloud is M = N m, we
can say that the total thermal kinetic energy of the cloud is
(2.7) E_{thermal} = \frac{3kT}{2} \frac{M}{m}
For our gas cloud to collapse, i.e. for gravity to win its battle over pressure, we need
(2.8) -E_{grav} \geq 2 E_{thermal}
9
Please do take a course in thermal physics to derive where this kinetic energy equation, the virial theorem
and the Ideal Gas Law equation come from.

where the factor of 2 here comes from the virial theorem. Substituting equations 2.5 and 2.7 into
equation 2.8, we find that in order for our gas cloud to collapse to form a star, the ratio between
the mass and radius must satisfy
(2.9) \frac{M}{R} \geq \frac{5kT}{Gm}

The minimum mass of a cloud that will collapse and form a star is called the Jeans Mass M_J
(anything less massive would not have enough gravity to overcome the pressure). The maximum
radius of a cloud that will collapse and form a star is called the Jeans Radius R_J (anything larger,
and the outer edges would be too weakly bound for the gravity to pull the cloud all together). Both
quantities are named after Sir James Jeans who derived them in 1902.

We can write both these Jeans quantities in terms of the number density of particles n where
(2.10) n = \frac{N}{V} = \frac{M}{m (4\pi/3) R^3}
Exercise 23. Use equations 2.9 and 2.10 to eliminate R to show that the Jeans Mass M_J is
(2.11) M_J = \sqrt{\frac{3}{4\pi}} \left( \frac{5kT}{G} \right)^{3/2} \frac{1}{m^2 \sqrt{n}}
Can you also write the Jeans Radius R_J in terms of the number density of particles n, the
temperature T and the mass of the particles m?

You need not memorise these equations, but you should remember that the Jeans Mass increases
with increasing temperature, as increased temperature means increased pressure, and hence
increased gravity and mass required for the gas cloud to collapse. The Jeans mass decreases
with increasing particle number density, because a higher particle density implies that the mass
is distributed over a smaller distance making gravity more effective. The Jeans mass also
decreases with increasing particle mass, as for a fixed mass, a higher particle mass means
a smaller number of particles and hence a lower pressure (as the kinetic energy and pressure
scales with the number of particles, not their mass).
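
These scalings can be seen directly by evaluating equation 2.11. This is a minimal sketch: the cloud temperature and number density below are illustrative assumptions rather than values from these notes, and the function name is hypothetical.

```python
import math

G = 6.67e-11     # gravitational constant, N m^2 kg^-2
K_B = 1.38e-23   # Boltzmann's constant, J K^-1
M_H = 1.67e-27   # mass of a hydrogen atom, kg

def jeans_mass(temperature, number_density, particle_mass=M_H):
    """Jeans mass of equation 2.11, in kg."""
    prefactor = math.sqrt(3.0 / (4.0 * math.pi))
    thermal = (5.0 * K_B * temperature / G) ** 1.5
    return prefactor * thermal / (particle_mass**2 * math.sqrt(number_density))

# ASSUMPTION: illustrative cold cloud, T = 10 K and n = 1e10 m^-3.
base = jeans_mass(10.0, 1e10)
print(f"M_J ~ {base:.1e} kg")
print(f"hotter (T x 4): {jeans_mass(40.0, 1e10) / base:.1f} times larger")
print(f"denser (n x 4): {jeans_mass(10.0, 4e10) / base:.1f} times as large")
```

Quadrupling the temperature raises the Jeans mass by a factor of 4^(3/2) = 8, while quadrupling the number density halves it.
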
Exercise 24. Use equation 2.9 to make an ‘order-of-magnitude’ estimate for the mass of a
spherical gas cloud of hydrogen atoms that would just collapse under gravity, if the cloud was at
room temperature and would fit in a car (i.e. have a radius R ∼ 1 m). Check your units using a
dimensional analysis.

Hint: Don’t forget to convert your room temperature from degrees Celsius to Kelvin.
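
The estimate in exercise 24 can be sketched as below, taking room temperature to be 20 degrees Celsius (an assumption). The function name is hypothetical.

```python
G = 6.67e-11     # gravitational constant, N m^2 kg^-2
K_B = 1.38e-23   # Boltzmann's constant, J K^-1
M_H = 1.67e-27   # mass of a hydrogen atom, kg

def minimum_collapse_mass(temperature, radius, particle_mass=M_H):
    """Smallest mass satisfying M / R >= 5 k T / (G m) (equation 2.9)."""
    return 5.0 * K_B * temperature * radius / (G * particle_mass)

# ASSUMPTION: room temperature taken as 20 degrees Celsius.
room_temperature = 20.0 + 273.15   # convert Celsius to Kelvin
mass = minimum_collapse_mass(room_temperature, radius=1.0)
print(f"M ~ {mass:.1e} kg")
```

The answer is an enormous mass for something that fits in a car, which tells us that room-temperature gas on this scale is nowhere near collapsing under its own gravity.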

Clever Student Question No 4: “That’s great. We’ve used a balance of kinetic and potential
energy to show that there is a tipping point beyond which a gas cloud will collapse under gravity
to form a star. But aren’t you forgetting something blindingly obvious here? Surely we don’t expect
gas clouds to be perfectly spherical or to have the same temperature and density throughout?”

Indeed, the Jeans Mass prediction is only really useful as a starting point to understand star
formation. There are many factors that complicate this simplified approach. Typically we see
young stars grouped together in star clusters, such as the Pleiades shown in Figure 14. Instead
of our cloud collapsing to form a single star, the theory suggests that ‘clumps’, i.e. variations in the
density and temperature within the cloud, cause the cloud to fragment. There are then numerous
centres towards which different parts of the cloud contract and we can apply the Jeans criterion
to each separate fragment. The cloud may also rotate or have magnetic fields which would inhibit
gravitational collapse. External processes may trigger gravitational contraction such as a close
approach with another star, or a direct collision with another cloud or a shock wave from a nearby
supernova explosion. Astronomers use supercomputers to include all these effects and simulate
the formation of the first stars in the Universe, a snapshot of which is shown in Figure 15.

Figure 14. Panels: ‘Coal-sack’ (left) and Pleiades (right). If the dinosaurs had looked up at
Pleiades (shown right) they would have seen a dark gas cloud much like the coal-sack (shown left).
You can see Pleiades (or the seven sisters) in the winter skies from Edinburgh. You’ll need to go to
Australia to see the coal-sack.

Figure 15. Star birth in the early Universe: This computer-simulated image shows the formation of
two high density regions (yellow) in the early universe, approximately 200 million years after the Big
Bang. The cores are separated by about 800 times the distance between the Earth and the Sun,
and are expected to evolve into a binary or ‘twin-star’ system. Image and caption courtesy of Ralf
Kaehler, Matthew Turk and Tom Abel.

2.3.2. Protostars: Hayashi Tracks on the HR diagram

Once a gas cloud starts collapsing under gravity it is called a protostar and it lies on the cool
side of the HR diagram. Detailed calculations by Henyey and then Hayashi between 1950 and 1960
predicted how the temperature and luminosity of the protostar would change as the gravitational
collapse heats up the gas. Figure 16 shows the results of this theoretical modelling for a range of
cloud masses. We call the path that the protostars follow a Hayashi track. We can understand
the different phases of the collapse qualitatively from the Stefan-Boltzmann law (equation 1.41),
as described in the Figure. As we will see often in this course, the mass of the star changes the
way it evolves and the more massive stars are hotter (more gravitational potential energy to heat
them up) and are therefore more luminous. Once nuclear fusion has commenced in the core,
and the star has settled into hydrostatic equilibrium (see section 1.4), the star is called a main
sequence star.

[Figure 16 annotations, on an HR diagram with luminosity increasing upwards (Brighter) and
surface temperature running from Hotter (left) to Cooler (right):]
1. The protostar is collapsing under gravity. It is much bigger and hence brighter than the main
sequence star it will become.
2. As the protostar shrinks, the luminosity decreases: remember the Stefan-Boltzmann law
$L = 4\pi R^2 \sigma T_e^4$.
3. As the temperature increases, nuclear fusion starts. This increases both the luminosity and the
temperature.
4. The star finally joins the main sequence when hydrostatic equilibrium is reached.

Figure 16. Hayashi Tracks of Protostars of different masses. The four stages of the evolution of
a 1 solar mass protostar (like our Sun) are listed on the left. For more massive protostars (upper
tracks), the luminosity hardly changes on the path from protostar to main sequence (the Hayashi
tracks are horizontal). This is because core nuclear burning starts much sooner shortening the
contraction phase. Image from CSIRO, Australia Telescope Outreach and Education.

The main evidence to support the theory behind protostars is the existence of T-Tauri stars, which
are a class of star found in the region of the HR diagram just above the main sequence. They
usually appear to live in or near dense clouds and their brightness varies, indicating that the star
is unstable. Often T-Tauri stars show evidence of a thin disk of gas called a proto-planetary disk.
Figure 17 shows Hubble Space Telescope observations deep into the Orion nebula that reveal
images of proto-planetary disks [10]. With all this evidence it is thought that the T-Tauri stars are

[10] We cover the exciting topic of planet formation in our pre-honours Astrobiology course which runs in Semester 1.

protostars with their luminosity powered by gravitational collapse, as their core temperatures are
too low for nuclear fusion.

HST Observations of Protoplanetary disks

Figure 17. Hubble Space Telescope observations deep into the Orion nebula that reveal images
of proto-planetary disks. Orion is visible from the winter Edinburgh skies. See if you can locate
where these proto-stars might be in the sky and picture the stars and planets forming, just as
they would have done in our solar neighbourhood 4.5 billion years ago when our own Sun formed.
Images adapted from ESA/NASA HST archive.

2.3.3. Brown Dwarfs

Before we discuss Main Sequence stars, we should make a quick note of the failures in the
protostar class. If the temperature in the protostar never becomes high enough to start nuclear
burning, the core becomes extremely dense and ‘degenerate’ (see section 3.4.2). These failed
stars are called brown dwarfs and have a very low mass $M < 0.2\,M_\odot$. They can be detected as
cool stars in infra-red wavelengths. Brown dwarfs are not planets as they form from a gas cloud.
Planets form from the disks of matter that surround stars.

2.4. The Main Sequence

Once a star is in hydrostatic equilibrium (see section 1.4.2), burning Hydrogen into Helium in its
core, it is called a Main Sequence star. Roughly 90% of a star’s life is spent in this state, on
the Main Sequence, which is seen as a diagonal band across the HR diagram crossing from
cool-redder-faint stars (lower right) up to the hot-bluer-bright stars (upper left) (see Figure 12).
Let’s first think qualitatively about the Main Sequence pattern to see if it makes sense: bright stars
emit more light because they must be producing more nuclear energy than the fainter stars. You
should be familiar with the idea that the hotter a reaction is, the faster it goes and this is true for
nuclear fusion. More nuclear energy production therefore implies a hotter star and we have a
qualitative understanding of the main sequence pattern. This is a physics course though, so let’s
now try and gain a quantitative understanding!

At the core of the star, high energy photons are produced through a fusion process. These
photons then disco dance through the star (see section 1.5.3), re-distributing that energy by
heating up the outer layers of the star. The energy is then released as thermal blackbody
radiation. If the nuclear power produced in the core exceeds that escaping from the surface,
then the net increase in the star’s thermal energy would cause the star to expand, lowering the
central pressure, and hence the central temperature (see section 1.4). The number of fusion
reactions per second, per kilogram, $\epsilon$, is related to the temperature $T$ and density $\rho$ as
(2.12) $\epsilon \propto \rho T^{n}$
where n = 4 for the p-p chain (see section 1.5.2). Hence, if the temperature or density decreases,
the number of fusion reactions decreases and the nuclear power decreases.
Exercise 25. Consider what would happen if the nuclear power produced in the core was less
than the thermal power emitted from the surface of the star. Can you put together a series of
events that would restore the balance between the power produced at the core and the power
emitted from the surface?

We can see that stars have their own safety valve to control the speed of nuclear reactions at the
core, and it is this fine balance between nuclear energy production, and thermal energy emission
that leads to the characteristic shape of the Main Sequence.

Consider the thermal energy in a small cube of temperature T inside the outer layers of a star.
The radiation energy density of this cube is given by
(2.13) $\epsilon_{\rm rad} = \dfrac{4\sigma T^4}{c}$ [11]
which can be derived from equation 1.39. Summing up over all the cubes in the star we
find that the total thermal energy $\propto R^3 T_c^4$, where $T_c$ here is the central temperature of the star,
making the approximation that the temperature throughout the star is linearly related to the
central temperature.
Exercise 26. Using the Pressure equations 1.20 and 1.23, show that the central
temperature is related to the Mass and Radius of the star as
(2.14) $T_c \propto \dfrac{M}{R}$
and hence that the total thermal energy of a star is
(2.15) thermal energy $\propto \dfrac{M^4}{R}$

The luminosity emitted by the star is just the power, i.e. the energy emitted per second, or
(2.16) $L(\text{emitted}) = \dfrac{\text{thermal energy}}{\text{time}}$
[11] A non-examinable footnote. Dimensionally this equation makes sense as we now have an energy density in units of
J/m$^3$. In equation 1.39 we calculated the energy emitted from a blackbody, per unit area, per unit time, given by
$\sigma T^4$. In time $\Delta t$ the radiation emitted from an area $\Delta A$ will be $\sigma T^4 \Delta t \Delta A$. In that time, if the radiation travelled
only perpendicular to the surface of the black body, the radiation will travel a distance $\Delta x = c\Delta t$, covering a volume
$V = \Delta x \Delta A = c \Delta t \Delta A$. The energy density (energy per unit volume) is therefore $\sigma T^4 \Delta t \Delta A / V = \sigma T^4 / c$. If you’re
now wondering where the extra factor of 4 in equation 2.13 comes from, it’s because you need to consider the light
radiating at all angles from the surface of the black body, not just perpendicular to it. You can read more about this
derivation here.

Because we know that the thermal energy emission has to be in perfect balance with the nuclear
energy production, we can conclude that the thermal energy emission timescale is set by the
radiative transfer timescales of a photon’s journey through the star. From section 1.5.3 we know
that the photon experiences $N = R^2/l^2$ collisions, where the distance between collisions $l$ is
called the mean free path. Photons travel at the speed of light, so the time to travel from the
core to the edge of the star is

(2.17) $\text{time} = \dfrac{\text{distance}}{\text{speed}} = \dfrac{N l}{c} \propto \dfrac{R^2}{l}$
In a massive hot star, all the electrons are stripped from their atoms and the scattering of the
photons depends almost entirely on the density of electrons such that $l \propto 1/\rho$ (the higher
the density, the shorter the distance between electrons and hence the shorter the path length
between collisions). As density $\rho \propto M/R^3$ we find that

(2.18) $L(\text{emitted}) \propto \dfrac{M^4}{R}\,\dfrac{R^3}{M R^2} \propto M^3$ for massive stars

For less massive stars, some electrons are still bound to their nuclei and so the mean free path,
or the distance between collisions l depends on temperature as well as density and is given by
(trust us on this one)

(2.19) $l \propto \dfrac{T_c^{3.5}}{\rho^2}$
Exercise 27. Show that the relationship between a star’s Luminosity L and its Mass M for a low
mass star producing energy through the p-p chain is given by

(2.20) $L(\text{emitted}) \propto M^{5.46}$ for low mass p-p chain stars

Difficulty factor warning: High alert

Hint: Follow the steps above but use a mean free path $l \propto T_c^{3.5}/\rho^2$ to show that $L(\text{emitted}) \propto
M^{5.5}/R^{0.5}$. In order to remove R from the relation, equate the thermal energy emitted L(emitted)
to the nuclear energy produced $L(\text{nuclear}) \propto M\epsilon$, where $\epsilon$ is the number of fusion
reactions per second, per kilogram, given in equation 2.12.
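
The exponent bookkeeping in this hint is easy to get wrong, so here is a minimal Python sketch that tracks the powers of M and R as exact fractions. It assumes only the scalings quoted in the text (equations 2.12, 2.14 to 2.17 and 2.19, together with ρ ∝ M/R³). The helper names `mul` and `power` are our own illustrative choices and are not part of the course material.

```python
from fractions import Fraction as F

def mul(a, b):
    """Multiply two power laws M^x R^y by adding their exponents."""
    return {k: a[k] + b[k] for k in ("M", "R")}

def power(a, p):
    """Raise a power law M^x R^y to the power p by scaling its exponents."""
    return {k: a[k] * p for k in ("M", "R")}

M = {"M": F(1), "R": F(0)}
R = {"M": F(0), "R": F(1)}

thermal = mul(power(M, 4), power(R, -1))         # thermal energy ∝ M^4 / R  (eq. 2.15)
T_c = mul(M, power(R, -1))                       # T_c ∝ M / R               (eq. 2.14)
rho = mul(M, power(R, -3))                       # rho ∝ M / R^3
mfp = mul(power(T_c, F(7, 2)), power(rho, -2))   # l ∝ T_c^3.5 / rho^2       (eq. 2.19)
time = mul(power(R, 2), power(mfp, -1))          # time ∝ R^2 / l            (eq. 2.17)
L_emitted = mul(thermal, power(time, -1))        # L = thermal energy / time (eq. 2.16)
L_nuclear = mul(M, mul(rho, power(T_c, 4)))      # L ∝ M * eps, eps ∝ rho T^4 (eq. 2.12)

# Equate L_emitted and L_nuclear to find R ∝ M^r, then eliminate R.
r = (L_nuclear["M"] - L_emitted["M"]) / (L_emitted["R"] - L_nuclear["R"])
index = L_emitted["M"] + L_emitted["R"] * r

print("L(emitted) exponents:", L_emitted)
print("R ∝ M^", r)
print("L ∝ M^", index, "=", float(index))
```

Working with `Fraction` rather than floats keeps the final index exact, so you can compare it directly with the rounded value quoted in equation 2.20.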

For medium mass stars a different fusion process takes place in the core called the CNO cycle.
This process still fuses Hydrogen into Helium, but using Carbon as a catalyst. This reaction is
very efficient and scales as $\epsilon \propto \rho T^{16}$ (see Star Quiz 3 and the second Problem Solving Workshop
for more on the CNO cycle).

We’ve now built up a quantitative understanding of how stars populate the Main Sequence. The
conclusion that we can draw is that it all depends on the mass of the star M! Astronomers
typically cheat and roughly combine the low mass and high mass results above together to say

(2.21) $L \propto M^4$ (Luminosity-Mass relationship for stars)

which is roughly correct for all star masses.

2.4.1. Main Sequence lifetime

The time that a star spends on the main sequence, burning Hydrogen into Helium, will depend
on the amount of fuel, and the rate at which it is consumed, i.e.
(2.22) $t_{\text{main sequence}} \propto \dfrac{M}{L} \propto \dfrac{1}{M^3}$
where we have used the rough Luminosity-Mass relationship from above. This means that the
brightest and most massive main sequence stars have the shortest lifetimes: they live fast and
die young. This is, perhaps, counterintuitive because the most massive stars have the most
fuel so one might expect them to live longer. As the nuclear fusion process depends on
temperature (see equation 2.12), and as massive stars are hotter (go back to the Gravity vs
Pressure discussion in section 1.4), it is indeed the most massive stars that use up their fuel
first.
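
To get a feel for how steep this scaling is, here is a short Python sketch of equation 2.22. The normalisation is an assumption that does not come from this section: we take the Sun's main sequence lifetime to be roughly 10 billion years. The function name is our own.

```python
T_SUN_GYR = 10.0  # ASSUMED main-sequence lifetime of the Sun in Gyr (not from the text)

def main_sequence_lifetime_gyr(mass_solar, t_sun_gyr=T_SUN_GYR):
    """Rough lifetime from t ∝ M / L ∝ M^-3, using L ∝ M^4 (eqs. 2.21 and 2.22)."""
    return t_sun_gyr * mass_solar ** -3

for m in (0.25, 1.0, 10.0, 40.0):
    print(f"M = {m:5.2f} Msun  ->  t ~ {main_sequence_lifetime_gyr(m):.3g} Gyr")
```

Because L ∝ M⁴ is only a rough average over all masses, the numbers this prints should be read as order-of-magnitude estimates.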

2.4.2. Stellar Sizes

Figure 18 shows a schematic Hertzsprung-Russell diagram indicating the masses of the stars
along the Main Sequence, with the cool-red-faint stars (lower right) having low masses $M \sim 0.25\,M_\odot$,
and the hot-blue-bright stars (upper left) having high masses $M \sim 40\,M_\odot$. It also shows lines of
constant size which can be calculated from the Stefan-Boltzmann law (equation 1.41) once
the Luminosity L and surface temperature $T_e$ are measured. Along the main sequence, the
cool-red-faint stars are the smallest stars, and the hot-blue-bright stars are the biggest. In the
next section we will see what happens to a star like our Sun when it runs out of fuel and evolves
off the main sequence. The path the star takes is shown in figure 18 as a black curved line, and
you can see that our Sun is going to get bigger, and brighter as it moves up the diagram and
turns into a Red Giant.
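
The lines of constant size come from re-arranging the Stefan-Boltzmann law for the radius, R = √(L / 4πσTₑ⁴). Here is a minimal Python sketch that checks the re-arrangement on the Sun. The constants are standard reference values that we have assumed (they are not quoted in this section), and the function name is our own.

```python
import math

SIGMA = 5.670374419e-8   # Stefan-Boltzmann constant in W m^-2 K^-4 (assumed reference value)
L_SUN = 3.828e26         # nominal solar luminosity in W (assumed reference value)
R_SUN = 6.957e8          # nominal solar radius in m (assumed reference value)
T_SUN = 5772.0           # solar effective temperature in K (assumed reference value)

def stellar_radius(luminosity_w, t_eff_k):
    """Radius in metres from L = 4 pi R^2 sigma T_e^4."""
    return math.sqrt(luminosity_w / (4.0 * math.pi * SIGMA * t_eff_k ** 4))

r = stellar_radius(L_SUN, T_SUN)
print(f"R = {r:.3e} m = {r / R_SUN:.3f} Rsun")
```

The same function can be used for any star once its luminosity has been converted to Watts and its surface temperature to Kelvin.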
Exercise 28. Use the Stefan-Boltzmann law to calculate the size of a star with a Luminosity
$L = 10^5\,L_\odot$ and surface temperature $T_e = 30{,}000$ K.

2.5. Red Giants and Horizontal Branch Stars

For the next stage of our understanding of the evolution of stars we will be quite qualitative,
understanding the path that a star takes across the HR diagram in terms of the next stage of the
battle between gravity and pressure. Figure 19 shows a diagram of a star’s journey from stable
Hydrogen core burning (on the Main Sequence), to Hydrogen Shell Burning (as a Red Giant:
3-4), to stable Helium Core burning (on the Horizontal Branch: 5).

When the Hydrogen fuel in the core runs out and nuclear fusion switches off, the pressure that
was supporting the core will decrease and the star will start to collapse under gravity. This
shrinking will release gravitational potential energy and the Luminosity of the star will increase
(Figure 19: 1-2). As the core shrinks, fresh Hydrogen from the outer layers of the star will be
pulled into hotter regions and ignite (Figure 19: 2). This shell-burning provides a new source
of energy which heats up the intermediate layers in the star, increasing the thermal pressure,
causing the whole outer envelope of the star to expand and hence the surface temperature that
we see to decrease (Figure 19: 2-3). The expansion of the outer envelope changes the opacity
of the star: it becomes much easier for photons produced by the nuclear fusion reaction to travel

Figure 18. A schematic Hertzsprung-Russell diagram showing the location of the Main Sequence,
Giants and White Dwarf stars. Lines of constant stellar radius are shown (diagonal lines). The black
curved line shows the evolution of a star, like our Sun, once it has run out of Hydrogen and it moves
off the Main Sequence. Image from Frontiers of Science, Columbia University.

through [12]. The star then becomes more and more luminous (Figure 19: 3-4) and is at this stage
known as a Red Giant. Whilst the outer envelope is expanding, the core is still shrinking and
heating up and eventually the Helium in the core will ignite (Figure 19: 4) and the star will settle
back into hydrostatic equilibrium with a Helium burning core and a Hydrogen burning shell. This
is known as the Horizontal Branch phase (Figure 19: 5).

Clever Student Question No 5: “That’s a great story, but show me the evidence”

Figure 20 shows the HR diagram for two different clusters of stars where each dot on the HR
diagram shows the measured luminosity and temperature of a star in the cluster. Visually, the
main difference that we observe between the open cluster (imaged left) and the globular cluster
(imaged right) is that globular clusters have many more stars, packed together in the cluster,
in comparison to open clusters. The differences between their HR diagrams, however, tell us
much more about the history and evolution of these two astronomical objects.

[12] Radiative Transfer: aka the photon crazy disco dance
[Figure 19 annotations, on an HR diagram of Log (Luminosity) (Brighter upwards) against Surface
Temperature (Hotter and Blue to the left, Red to the right), with the Main Sequence marked:]
1-2. Hydrogen core burning ends and the Helium core radius decreases, causing the Luminosity to increase.
2. Hydrogen pulled into hotter regions ignites.
2-3. Hydrogen shell burning heats up the envelope of the star, causing it to expand and cool.
3-4. The expansion changes the opacity of the star, making it more luminous (Red giant phase).
4. The core is still shrinking and heating up until the Helium ignites.
5. The star becomes a horizontal branch star burning Helium in its core.

Figure 19. From the Main Sequence to the Horizontal Branch: a star’s journey across the HR
diagram from stable Hydrogen core burning (on the Main Sequence), to Hydrogen Shell Burning
(as a Red Giant), to stable Helium Core burning (on the Horizontal Branch).

Recall our theory of star birth from section 2.3, where clouds of gas collapse under gravity, raising
the core temperature to a high enough level for nuclear fusion to start. Massive gas clouds won’t
collapse to form a single star, however, as they tend to be lumpy and fragment into individual
clumps which then contract and heat up. This theory of star birth would conclude that stars form
in clusters, around the same time, and that all the stars in a single cluster will have the same
composition (as they all form from the same cloud). The only difference between the individual
stars in a cluster is therefore going to be their mass.

The first thing we can observe by comparing the two star cluster HR diagrams is that the Main
Sequence is longer for M67, the open cluster, in comparison to M4, the globular cluster. This
means that the most massive stars in M4 have already evolved off the main sequence (remember
that on the Main Sequence, the luminosity $L \propto M^4$, so the high luminosity part of the main
sequence that is missing from the M4 HR diagram must be the most massive stars in the cluster).
In the globular cluster, we can see that the missing Main Sequence stars have evolved all the way
along the red giant branch and some have even settled onto the horizontal branch, as predicted
by our qualitative look at stellar evolution.

The second thing we can observe by comparing the two star cluster HR diagrams is that the
globular cluster is significantly older than the open cluster. Equation 2.22 shows that the main
sequence lifetime falls steeply with a star’s mass ($t \propto 1/M^3$), so the most massive stars
leave the main sequence first. If the massive stars that have already left the
main sequence in the globular cluster are still on the Main Sequence in the open cluster, this
tells us that the globular cluster was formed long before the open cluster.

[Figure 20 panels: HR diagrams (Luminosity against Temperature) for M67 (left) and M4 (right).]

Figure 20. The Hertzsprung-Russell diagram for two star clusters; M67, an open cluster (imaged
left) and M4, a globular cluster (imaged right). Image adapted from D. Perley, Berkeley Physics
Department.

Interestingly we find that globular clusters are typically found in the halo of our Milky Way galaxy,
and open clusters are typically found in the disk. This is seen as good evidence to support the
theory that it was the halo of our Milky Way galaxy that formed first, and the blue wispy spiral
arms in the disk that formed later. More on this subject in the galaxies part of the course!
Exercise 29. Use the schematic HR diagram in figure 19 to annotate the data from the star
clusters in figure 20. Label the Main Sequence, the Red Giant Branch, and the Horizontal
Branch. What would the HR diagram of a newly formed star cluster look like?

2.6. Summary

In this section we looked at how we use two basic measurements, a star’s luminosity L (how
bright it is) and its colour (which we can convert to a surface temperature using Wien’s Law, equation 1.45),
to learn about the evolution of stars in the Universe. Plotting these two quantities on the
“Hertzsprung-Russell” diagram we see that stars prefer to live in certain regions of Luminosity
and temperature. We made quantitative arguments to understand the slope of the Main Sequence,
where stars are powered by nuclear fusion in the core, fusing Hydrogen into Helium. We then
made qualitative arguments to understand the evolution of a star in Luminosity and surface
temperature after the fuel in the core runs out. We compared our theory to data from an open
star cluster and a globular star cluster and found that the data supported our theory.

From this result we can draw some very important conclusions:

• Stars change as they age and evolve


• Our own Sun will one day turn into a red giant, and its radius will expand to roughly 40
times its current radius (see Figure 19) which brings it dangerously close to the orbit of
the Earth.

and what is exciting is that we can draw these conclusions without building a time machine to
zoom through space and time to see what will happen to our Sun in the future! In the next
chapter we’ll ask what happens when the Helium in the core runs out, also known
as Star Death!
3: STAR DEATH

3.1. Outline of content


• Pulsating Stars
• Planetary Nebulae
• White Dwarfs
• Supernovae and Black Holes

3.2. Pulsating Stars

“Twinkle twinkle little star, how I wonder what you are”.

If we make repeated observations of evolved stars in dense globular clusters, we find that some
of them twinkle: they get brighter, then they dim, then they get brighter again [13]. These pulsating
stars live in the HR diagram in a region termed the instability strip. This can be seen in the HR
diagram of globular cluster M4 (Figure 20), as a ‘hole’ in the horizontal branch. What processes
could be occurring inside the core and envelope of the star to cause these pulsations? To
answer this question we first need to extend our knowledge of nuclear fusion processes and
then understand how that impacts on the physics occurring in the very outer envelope of the
star.

3.2.1. The triple-alpha process

Horizontal branch stars are fuelled by a core Helium fusion reaction known as the triple-alpha
process. Very high temperatures are needed for the two colliding Helium nuclei to overcome the
repulsive Coulomb force (see section 1.5) for a fusion reaction to occur.

The triple-alpha process has two steps which can be written symbolically. It starts with two
Helium nuclei (also known as alpha particles, hence the ‘alpha-process’), fusing together to
make Beryllium $^{8}_{4}\mathrm{Be}$
[13] If you want to witness this for yourself, look at the animated HST observations of M3 from Western Michigan
University: http://homepages.wmich.edu/~korista/stargal-images/M3-rrlyraemov_stanek.gif

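(3.1) $^{4}_{2}\mathrm{He} + {}^{4}_{2}\mathrm{He} \rightarrow {}^{8}_{4}\mathrm{Be} + \gamma$

followed by an additional reaction with a third Helium nucleus (hence ‘triple alpha process’)

(3.2) $^{8}_{4}\mathrm{Be} + {}^{4}_{2}\mathrm{He} \rightarrow {}^{12}_{6}\mathrm{C} + \gamma$

producing Carbon $^{12}_{6}\mathrm{C}$. In very high temperature cores, the Carbon $^{12}_{6}\mathrm{C}$ can also catch an
additional Helium nucleus to make Oxygen

(3.3) $^{12}_{6}\mathrm{C} + {}^{4}_{2}\mathrm{He} \rightarrow {}^{16}_{8}\mathrm{O} + \gamma$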

It is this reaction in the cores of horizontal branch stars that provides the main source of the
Oxygen that you’re currently breathing [14].

The Beryllium $^{8}_{4}\mathrm{Be}$ made in the first step of the reaction is very unstable and the back-reaction
(where the Beryllium nucleus decays back into two Helium nuclei) occurs easily. This makes
the reaction very sensitive indeed to the temperature, with the number of fusion reactions per
second, per kilogram, $\epsilon$, related to the temperature $T$ as
(3.4) $\epsilon \propto \rho T^{n}$
with n = 40 for the triple-alpha process. Even a very small change in temperature therefore
results in a large change in the energy output from the core.

3.2.2. The Outer Envelope of a Horizontal Branch Star

A highly temperature dependent reaction is occurring deep in the core of our horizontal branch
star, varying the energy output. But we are unable to observe the core, as the only light we see is
radiated from the star’s outer envelope. The outer envelope will gently expand (cool) and contract
(heat up) in response to the changes occurring in the core. During the horizontal branch phase
in a star’s life, however, some new physics comes into play in the outer envelope that makes
this oscillation in the radius of the outer envelope highly predictable; as if the star is rhythmically
breathing in and out.

We’re used to thinking about the Helium that has been created in the star’s core through the
p-p chain or the CNO cycle, but there is also a small amount of Helium that exists in the
outer envelope, some of which has been dredged up from the core. When the outer envelope
contracts, this Helium is compressed along with the Hydrogen. The majority of the gravitational
energy from the envelope’s contraction is absorbed by heating up the compressed gas, increasing
the pressure, and causing the envelope to expand again [15]. There is however a shell in the star’s
outer envelope where the temperature and density conditions are such that instead of heating
up the gas, the Helium and Hydrogen instead become ionised (lose their electrons). This is
known as the ionisation layer.

[14] Please be at least slightly in awe at this point. Take a deep breath, and imagine the hot fiery core from where the
Oxygen, that is now coursing around your veins, came.
[15] Remind yourself about hydrostatic equilibrium, covered in section 1.4.2

The photons radiating away from the core already have an erratic path as they collide with
particles as they attempt to exit the star (see section 1.5.3). When these photons hit the
ionisation layer, their mean-free path (the distance between collisions), significantly decreases
as there are suddenly many more charged electrons and nuclei for the photons to collide with.
This ionisation layer is therefore essentially opaque, and it effectively traps in the heat. The layers
in the star’s envelope that are below the ionisation layer heat up further, the pressure increases
and they expand. This pushes the ionisation layer out to cooler temperatures, at which point
the Helium and Hydrogen nuclei in the ionisation layer recombine with the electrons, making
the outer envelope transparent once more. The photons stream out of the star, increasing its
brightness, until all the trapped heat has been released. The envelope has then cooled and
starts to contract again, and the whole process repeats.

Let’s summarise what causes stellar pulsation with a numbered list:

1. The outer envelope contracts, compressing the gas within it
2. Within a shell inside the envelope, the temperature and density conditions are such that
the compression results in the ionisation of the Helium and Hydrogen in the envelope
3. This ionisation layer is opaque and traps in the heat from the star’s core
4. The gas in the envelope below the ionisation layer heats up and expands, pushing the
ionisation layer out to cooler temperatures
5. The ions re-combine, and the envelope becomes transparent again
6. The trapped energy is released in a burst of light, and the outer envelope contracts again
7. The process repeats.

The time a star spends on the instability strip is short as it depends so critically on the density and
temperature being just right for an ionisation layer to form. If the star is too hot, the ionisation layer
will form too close to the star’s surface, and there won’t be enough mass above it to re-compress
and ionise the layer after the initial recombination stage. If the star is too cool, the ionisation
layer will form too deep into the envelope and turbulence from the core will prevent the build up
of heat. It’s this maximum and minimum temperature that defines the ‘edges’ of the instability
strip (i.e. the width of the hole in the horizontal branch seen in the HR diagram of globular cluster
M4 in Figure 20). Eventually the horizontal branch star will come out of this unstable phase and
settle back into an equilibrium state.

3.2.3. RR Lyrae and Cepheid Stars

There are two known types of variable stars called Cepheids and RR Lyrae stars that live on
the instability strip. They differ in the time period for each pulsation with the lower luminosity
RR Lyrae stars pulsing every few hours, and the higher luminosity Cepheids pulsating every few
days to weeks, as shown in figure 21. It was Henrietta Leavitt, an astronomer in the early 1900s,
who was the first to discover that the rate at which variable stars pulsate is dependent on their
luminosity.
Exercise 30. The luminosity of a pulsating RR Lyrae star typically varies by 4%. Assuming that
the surface temperature doesn’t change, calculate by how much the star’s radius changes during
the pulsation.

Figure 21. Relationship between pulsation period and luminosity for various types of pulsating star.
The Type I Cepheids are the most useful for distance measurements, because they are so bright
which means we can use them as a standard candle in section ??. Image from CSIRO, Australia
Telescope Outreach and Education.

Hint: You might find the Stefan-Boltzmann-law, equation 1.41, useful to solve this exercise.

We can see crudely how the relationship between pulsation period P and the star’s luminosity L
occurs. When the star has expanded and cooled and the ionisation layer has recombined and
become transparent, gravity will significantly dominate over the pressure. The subsequent initial
collapse is therefore just free-fall under gravity. In exercise 8 we showed that a mass in free-fall
collapsing under gravity would take a time P that depends on the mass M and radius R as
(3.5) $P \propto \sqrt{\dfrac{R^3}{M}}$
We can use this proportionality relationship between free-fall time and the physical properties
of the star to describe the variable star’s pulsation period, even though we know of course that
pressure will eventually halt that free-fall. What we would like to do, however, is turn this into a
relationship in terms of observable parameters, luminosity L and temperature T .
Exercise 31. Show that the time-period of a variable star is related to its luminosity L and
temperature T as
(3.6) $P \propto L^{5/8}\, T^{-3}$

Hint: You’ll need to use the Stefan-Boltzmann law, equation 1.41, and the relationship between a
star’s Luminosity and Mass, $L \propto M^4$, from section 2.4.
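
If you would like to check your algebra, here is a minimal Python sketch that does the exponent bookkeeping with exact fractions. It assumes only the relations named in the hint (R ∝ L^{1/2} T^{-2} from the Stefan-Boltzmann law and M ∝ L^{1/4} from L ∝ M⁴) together with equation 3.5. The dictionary names are our own.

```python
from fractions import Fraction as F

# Exponents of L and T in each quantity.
R_exp = {"L": F(1, 2), "T": F(-2)}   # R ∝ L^(1/2) T^(-2)  (Stefan-Boltzmann law)
M_exp = {"L": F(1, 4), "T": F(0)}    # M ∝ L^(1/4)         (from L ∝ M^4)

# P ∝ sqrt(R^3 / M) = R^(3/2) M^(-1/2)  (eq. 3.5)
P_exp = {k: F(3, 2) * R_exp[k] - F(1, 2) * M_exp[k] for k in ("L", "T")}

# At fixed temperature P ∝ L^a, so on a log(L) vs log(P) plot the slope is 1/a.
slope_logL_vs_logP = 1 / P_exp["L"]

print("P ∝ L^", P_exp["L"], " T^", P_exp["T"])
print("predicted slope of log(L) against log(P):", slope_logL_vs_logP)
```

The last line is the number to compare against the slope you measure from a period-luminosity plot, bearing in mind that it assumes all the stars share the same temperature.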

From this result we can see that the pulsation period depends on both luminosity and temperature.
The formation mechanism of the ionisation layer, which causes the pulsation, means that the
temperature range over which stars pulsate is very limited. This therefore makes the bright
Cepheid stars excellent standard candles which can be used to measure distances across the
Universe. More on this in the Galaxies part of the course in section ??.
Exercise 32. From the data shown in Figure 21, estimate the power-law index for the dependence
of luminosity on period observed for Type I Cepheids. How does it compare to your theoretically
derived relationship in exercise 31?

Hint: The data is plotted on a ‘log-log’ scale, e.g. y = log(L) vs x = log(P). You’ll need to
remember your log-rules that $\log(a^b) = b \log a$.

3.2.4. Stellar Winds

In this section we shall prove that the gravitational pull at the surface of a Red Giant and a
Horizontal Branch Star is much less than it was during its main sequence lifetime. Atoms in giant
star atmospheres can therefore more easily escape resulting in a stellar wind from Red Giant
and Horizontal branch stars.

Let’s consider a star of mass M and radius R. The potential energy that a Hydrogen atom has
on the surface of this star is
(3.7) $PE = -\dfrac{G M m}{R}$
The kinetic energy is given by
(3.8) $KE = \dfrac{1}{2} m v^2$
where m is the mass of the atom and v is its velocity. If the sum of the kinetic
energy (encouraging the atom to leave the star) and the potential energy (encouraging the atom
to stay bound to the star) is greater than zero, the atom will leave the star.
Exercise 33. Show that for an atom to leave the surface of a star, its velocity must exceed the
star’s escape velocity, given by
(3.9)  $v_e = \sqrt{\dfrac{2GM}{R}}$
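
To get a feel for the size of this velocity, the following short Python sketch evaluates equation 3.9 for the present-day Sun. The values of G, the solar mass and the solar radius are standard rounded values that we assume here; they may differ slightly from those on the Constants page.

```python
import math

# Assumed standard values (rounded)
G = 6.67e-11       # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.99e30    # solar mass, kg
R_SUN = 6.96e8     # solar radius, m

def escape_velocity(mass_kg, radius_m):
    """Escape velocity from equation 3.9: v_e = sqrt(2GM/R)."""
    return math.sqrt(2.0 * G * mass_kg / radius_m)

v_sun = escape_velocity(M_SUN, R_SUN)
print(f"Escape velocity from the Sun's surface: {v_sun / 1e3:.0f} km/s")
```

The answer comes out at several hundred kilometres per second, which is why only the fastest particles in the Sun’s outer envelope can escape.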

When our own Sun turns into a Red Giant, its Luminosity will increase by a factor of 1000 and
its colour will change to red. Can we calculate the difference between the escape velocity of
particles from our Sun now, and 5 billion years into the future?

First we will assume that any Main Sequence mass loss is negligible such that the mass of the
Red Giant $M_{RG} = 1 M_\odot$, the mass of the Sun now. Any change in escape velocity therefore
depends only on the change in the size of the star:
(3.10)  $\dfrac{v_e(RG)}{v_e(\odot)} = \sqrt{\dfrac{R_\odot}{R_{RG}}}$

To calculate the size of the Red Giant we can combine Wien’s Law (equation 1.45) to estimate
a surface temperature, with the Stefan-Boltzmann law (equation 1.41) (see section 2.4.2). It’s
tempting to put numbers in at the start, but you’ll be less prone to calculator errors if you work
through the algebra, adding numbers only at the end. Using our knowledge that the Sun is
actually green (see section 1.6.1) we can write
(3.11)  $T_{RG} = \dfrac{2.9 \times 10^{-3}}{\lambda(\mathrm{red})} = \dfrac{\lambda(\mathrm{green})}{\lambda(\mathrm{red})}\, T_\odot$
Re-arranging the Stefan-Boltzmann law we have
(3.12)  $R = \sqrt{\dfrac{L}{4\pi\sigma T^4}}$
and hence
(3.13)  $\dfrac{R_\odot}{R_{RG}} = \sqrt{\dfrac{L_\odot}{L_{RG}} \dfrac{T_{RG}^4}{T_\odot^4}}$
We can now plug in some numbers: λ(green) ≈ 500 nm, λ(red) ≈ 650 nm, and $L_{RG} = 1000 L_\odot$,
(3.14)  $\dfrac{R_\odot}{R_{RG}} = \sqrt{10^{-3} \left(\dfrac{500}{650}\right)^4}$
to find that $R_{RG} \approx 60 R_\odot$: the red giant is roughly sixty times larger than its younger main sequence
self, and the escape velocity is a factor of ∼ 8 higher for the main sequence star, compared
to the red giant star.
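
The arithmetic in equations 3.10 to 3.14 can be checked with a few lines of Python. This is only a sketch under the same assumptions as the text (peak wavelengths of 500 nm and 650 nm, and a luminosity increase of a factor of 1000); the function and variable names are our own.

```python
import math

def radius_ratio_sun_to_giant(luminosity_factor, lambda_sun_nm, lambda_giant_nm):
    """R_sun / R_RG from equation 3.13.

    Wien's law gives T_RG / T_sun = lambda_sun / lambda_giant (equation 3.11),
    and luminosity_factor is L_RG / L_sun.
    """
    temperature_ratio = lambda_sun_nm / lambda_giant_nm
    return math.sqrt(temperature_ratio ** 4 / luminosity_factor)

ratio = radius_ratio_sun_to_giant(1000.0, 500.0, 650.0)
giant_radius_in_solar_radii = 1.0 / ratio

# Equation 3.10 inverted: v_e(sun) / v_e(RG) = sqrt(R_RG / R_sun)
escape_velocity_factor = math.sqrt(giant_radius_in_solar_radii)

print(f"R_RG is about {giant_radius_in_solar_radii:.0f} solar radii")
print(f"Escape velocity is lower by a factor of about {escape_velocity_factor:.1f}")
```

With these inputs the sketch gives a radius of a little over 50 solar radii and an escape-velocity factor of about 7, the same order as the rounded values quoted above.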
Exercise 34. A blue Cepheid Star is 6000 times more luminous than the Sun and twice as
massive as the Sun. Calculate the escape velocity for a Hydrogen atom on the surface of the
Cepheid.

Figure 22. Image of Alaska and the Aurora photographed by an STS-123 Endeavour crew-member
on the International Space Station on March 21st 2008 (taken from the Telegraph Online).

The qualitative conclusion that we can draw from this exercise is that the larger the star is, the
lower the escape velocity, and the higher the chance for atoms to leave the surface of the star.
This is called a stellar wind. The Sun is currently losing ∼ $10^{-14} M_\odot$ per year, the effects
of which we observe as Aurorae (or the Northern Lights). It’s only the highest energy particles
in the outer envelope of the Sun which have velocities greater than the Sun’s escape velocity.
These high energy particles collide with molecules of oxygen and nitrogen in the upper layers
of our own atmosphere (as seen in Figure 22). These collisions excite the molecules, causing
the bound electrons to jump to higher energy orbits. When the electrons eventually return to
their ground state (e.g. n = 1 in the Rydberg formula, equation 1.12) they emit light. The vibrant
colours seen in the Aurorae depend on which electron transition in which molecule has occurred.

When the Sun enters the red giant stage, we calculated that the escape velocity decreases by a
factor of ∼ 8. A detailed calculation finds that the mass loss will increase by a factor of ∼ 10 to
∼ $10^{-13} M_\odot$ per year when the Sun is in its Red Giant phase.

3.3. Planetary Nebulae

In this section we’ll make a qualitative argument for the physics that results in the production of
the stunning Planetary Nebulae16 seen in Figure 25. Again we go back to the on-going battle
between gravity and pressure which recommences when the Helium in the core of our Horizontal
Branch star runs out and core nuclear reactions cease leaving a carbon-oxygen core. This time
Gravity wins over pressure in the cooling core and the core contracts. Helium deposited by the
Hydrogen-burning shell is pulled towards the centre, and the temperature rises as a result of
the increased pressure. A Helium shell ignites, the outer envelope expands and the star moves
up the HR diagram again along the asymptotic giant branch (see the extended HR diagram in
Figure 23).
Exercise 35. Without looking back to section 2.5, can you construct a qualitative argument for
the changes in a star’s Luminosity and Surface Temperature, as mapped out on the HR diagram
in Figure 23, for each stage of a star’s journey from the main sequence to the Asymptotic Giant
Branch? Check your argument against the discussion in section 2.5.

The structure of the giant star at this point is starting to resemble an onion, with different
temperature-dependent reactions occurring in layers at different temperatures (see Figure 24). As the temperature
in the Helium burning shell grows, the shell should expand to compensate, but it is thought
that the increased pressure is too small to lift the material above it. The temperature and
Helium burning rate therefore increase rapidly (recall equation 3.4) until the shell pressure is so
significant that there is a rapid release of energy called a Helium shell flash.

3.3.1. Heavy-Element Production: s and r processes

Stars need fuel to keep shining, and as the fuel runs out, the star needs to get progressively
hotter for the next fusion step to occur. For intermediate mass stars (3-8 solar masses) Carbon
burning starts in the core, which produces Neon and Magnesium, but as the elements get heavier
and heavier the Coulomb repulsion becomes stronger, and not even quantum tunnelling magic
can allow heavy elements to fuse. We know these heavy chemicals exist though, so how did
they come about?

16 Planetary Nebulae have absolutely nothing to do with planets. They were first named by William Herschel in the
1780s because when viewed through his telescope, they were extended sources, much like the planets in contrast
to the point-source stars. Their name really should have been changed, but astronomers do love to confuse, so
they’ve remained Planetary Nebulae!

Figure 23. From the Main Sequence to the Asymptotic Giant Branch: a star’s journey across the
HR diagram from stable Hydrogen core burning (on the Main Sequence), to Hydrogen Shell Burning
(as a Red Giant), to stable Helium Core burning (via the instability strip on the Horizontal Branch), to
layers of Helium and Hydrogen shell burning with an inert Carbon-Oxygen core (as an Asymptotic
Giant Branch star).

The Coulomb force between two charged particles separated by distance r is

(3.15)  $F = \dfrac{k_c |q_1 q_2|}{r^2}$

where F is the force, $k_c$ is a constant and $q_1$, $q_2$ are the charges on the two particles. If we want to overcome
this force, we therefore have to use uncharged particles, namely neutrons. A neutron
has zero charge, q = 0, and so F = 0 in the case where neutrons fuse with nuclei.

Figure 24. The stellar structure of an asymptotic giant branch star which is composed of an inert
carbon-oxygen core surrounded by a shell of Helium undergoing a triple-alpha fusion reaction,
surrounded by a shell of Helium, surrounded by a shell of Hydrogen undergoing a p-p chain or
CNO cycle fusion reaction, surrounded by an outer envelope of neutral Hydrogen. The pressure in
the inner Helium burning shell builds rapidly and is released in a series of bursts called Helium shell
flashes.

From this we can conclude that heavy Elements are produced through neutron bombardment
which happens on both ‘slow’ timescales, known as an s-process, and ‘rapid’ timescales, known
as an r-process.

Let’s write this out symbolically:


(3.16)  ${}^{A}_{Z}\mathrm{Element} + {}^{1}_{0}n \rightarrow {}^{A+1}_{Z}\mathrm{Element}$
remembering that the Mass Number A is the number of protons plus neutrons, and the Atomic
Number Z is the number of protons.

For an s-process, where the neutron capture is slow, this new heavy Element will decay, converting
the captured neutron into a proton and emitting an electron and an antineutrino, before it can catch
another neutron:
(3.17)  ${}^{A+1}_{Z}\mathrm{Element} \rightarrow {}^{A+1}_{Z+1}\mathrm{Element} + e^- + \bar{\nu}$

For an r-process, where the neutron capture is rapid, this new heavy Element can catch another
neutron:
(3.18)  ${}^{A+1}_{Z}\mathrm{Element} + {}^{1}_{0}n \rightarrow {}^{A+2}_{Z}\mathrm{Element}$
and can keep catching neutrons and building up mass in this way.
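
The bookkeeping in equations 3.16 to 3.18 is easy to mimic in code. The Python sketch below tracks only the Mass Number A and Atomic Number Z of a nucleus; the class and function names are hypothetical, the Iron-56 seed is just an example, and the sketch ignores whether a given nucleus is actually stable or how long a real decay would take.

```python
from typing import NamedTuple

class Nucleus(NamedTuple):
    mass_number: int    # A: protons plus neutrons
    atomic_number: int  # Z: protons

def capture_neutron(nucleus):
    """Equation 3.16: A increases by one, Z is unchanged."""
    return Nucleus(nucleus.mass_number + 1, nucleus.atomic_number)

def beta_decay(nucleus):
    """Equation 3.17: a neutron turns into a proton, so Z increases by one."""
    return Nucleus(nucleus.mass_number, nucleus.atomic_number + 1)

def s_process_step(nucleus):
    """Slow capture: one neutron is caught, then the nucleus decays."""
    return beta_decay(capture_neutron(nucleus))

def r_process(nucleus, captures):
    """Rapid capture: several neutrons are caught before any decay."""
    for _ in range(captures):
        nucleus = capture_neutron(nucleus)
    return nucleus

seed = Nucleus(mass_number=56, atomic_number=26)  # example seed: Iron-56
print("after one s-process step:", s_process_step(seed))
print("after three rapid captures:", r_process(seed, 3))
```

Chaining these two functions in different orders is a quick way to explore which (A, Z) combinations can be reached from a given seed.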
Exercise 36. Starting with Neon, ${}^{20}_{10}\mathrm{Ne}$, write down a combination of s-process and/or r-process
reactions to build Sodium ${}^{21}_{11}\mathrm{Na}$ and Magnesium ${}^{22}_{12}\mathrm{Mg}$.

In the giant stars, the neutron density is not high enough for r-processes to occur (this only
happens in a massive supernova explosion, discussed in section 3.5). Many s-process reactions
do occur, however, producing heavy elements. Each Helium shell flash disrupts the star,
dredging up elements from the core. A superwind is triggered by the pulsations from the Helium
flashes and in approximately 1000 years the outer envelope of the star is thrown off, leaving
behind the hot core. All the chemicals made in the nuclear reactions are thrown into space,
which we see as beautiful planetary nebulae (see Figure 25).

Figure 25. A montage of planetary nebulae as imaged by HST. Images from NASA STScI.
Exercise 37. Why are planetary nebulae all different colours and different shapes?

3.4. White Dwarfs

Over time the beautiful planetary nebulae, created by violent pulsations and superwinds from the
death throes of giant stars, drift away to become part of the interstellar medium. This gas will
eventually start to clump and re-collapse to form new stars and planets. The giant star’s core,
however, still remains, and is observed as a luminous, blue star called a white dwarf. White
dwarf stars populate the lower left corner of the HR diagram (see Figure 18).

Let’s use our armoury of physics tools that we’ve developed over this course to determine the
properties of white dwarf stars. First we start with our two fundamental observations: brightness
and colour. For a typical white dwarf, we measure
(3.19)  $L_{WD} = 0.001 L_\odot$,   $\lambda_{WD} = 300$ nm
Exercise 38. Using these two observations, calculate the effective surface temperature of the
White Dwarf and its radius.

Hint: Use Wien’s Law (equation 1.45) followed by the Stefan-Boltzmann Law (equation 1.41).

You should have found that the effective surface temperature of the White Dwarf is $T_e \approx 10{,}000$ K
and that the White Dwarf’s radius is about the same as that of planet Earth, with $R_{WD} \approx 7500$ km.
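
As a check on these numbers, here is a minimal Python sketch of the two-step calculation: Wien’s Law for the temperature, then the re-arranged Stefan-Boltzmann law (equation 3.12) for the radius. The Wien constant and solar luminosity are the values quoted elsewhere in this text; the Stefan-Boltzmann constant is an assumed standard value, and the exact radius you get depends on the constants adopted.

```python
import math

WIEN_CONSTANT = 2.9e-3   # m K, as used in equation 3.11
SIGMA = 5.67e-8          # Stefan-Boltzmann constant, W m^-2 K^-4 (assumed standard value)
L_SUN = 4e26             # solar luminosity in W, as quoted in section 4.2.1

def wien_temperature(peak_wavelength_m):
    """Surface temperature from the wavelength of peak emission."""
    return WIEN_CONSTANT / peak_wavelength_m

def stefan_boltzmann_radius(luminosity_w, temperature_k):
    """Equation 3.12: R = sqrt(L / (4 pi sigma T^4))."""
    return math.sqrt(luminosity_w / (4.0 * math.pi * SIGMA * temperature_k ** 4))

t_wd = wien_temperature(300e-9)                       # lambda_WD = 300 nm
r_wd = stefan_boltzmann_radius(0.001 * L_SUN, t_wd)   # L_WD = 0.001 L_sun

print(f"T_WD is about {t_wd:.0f} K")
print(f"R_WD is about {r_wd / 1e3:.0f} km")
```

The temperature comes out just under 10,000 K and the radius at several thousand kilometres, comparable to the radius of the Earth.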

To calculate the mass of the Sun in our first Problem Solving Workshop, we looked at the orbit of
planet Earth and balanced the centripetal force with the gravitational force. Astronomers can use
a similar technique to determine masses of other stars, provided they come in pairs. These are
known as binary systems. Observations of white dwarf binaries yield a typical white dwarf mass
of roughly $M_{WD} = 1 M_\odot$. This is the same mass as our own Sun but compressed into a much
smaller volume.

Calculating the density of the white dwarf we find


(3.20)  $\rho_{WD} = \dfrac{3 M_{WD}}{4\pi R_{WD}^3} \approx 10^9$ kg m$^{-3}$
The white dwarf will be composed of mainly Carbon and some Oxygen, as the end products of
the triple-alpha process (see section 3.2.1) that was occurring in the core of the giant star before
it shed its outer envelope and turned into a white dwarf.
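
The density in equation 3.20 follows from one line of arithmetic, sketched here in Python. The solar mass is an assumed standard value and the radius is the rounded 7500 km quoted above.

```python
import math

M_SUN = 1.99e30   # solar mass, kg (assumed standard value)
R_WD = 7.5e6      # white dwarf radius in m, the rounded value quoted in the text

def mean_density(mass_kg, radius_m):
    """Equation 3.20: mass divided by the volume of a sphere."""
    return 3.0 * mass_kg / (4.0 * math.pi * radius_m ** 3)

rho_wd = mean_density(1.0 * M_SUN, R_WD)
print(f"White dwarf mean density: {rho_wd:.2e} kg m^-3")
```

For comparison, water has a density of 1000 kg m^-3, so white dwarf material is about a million times denser.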

Following the same argument as we made for our own Sun, the existence of White Dwarfs
demonstrates that the gravitational forces trying to pull the matter together must be balanced by
pressure forces, otherwise the star would collapse.

Exercise 39. Following the hydrostatic equilibrium derivation, in section 1.4, show that, assuming
the white dwarf can be described as an ideal gas, the central temperature for the white dwarf is
given by

(3.21)  $T_{\mathrm{ideal\ gas}} = \dfrac{G M_{WD} m_C}{2 k R_{WD}} \approx 10^{10}$ K

For simplicity assume that the white dwarf is made up of Carbon atoms, for which the mass of a
single Carbon atom is $m_C = 2 \times 10^{-26}$ kg.

At this staggeringly high temperature, the next phase of fusion burning would commence, turning
the white dwarf’s Carbon and Oxygen core into Neon and Magnesium and the luminosity would
rocket up. Clearly this isn’t happening though so something is wrong with our assumptions.

We can estimate the typical spacing between atoms in the star by looking at the number density
of Carbon atoms:
(3.22)  $d = \sqrt[3]{\dfrac{m_C}{\rho_{WD}}} \approx 10^{-12}$ m

This separation is a factor of ten smaller than the Bohr radius that we calculated in exercise 5
where $r_b = 5.3 \times 10^{-11}$ m. The Bohr radius is the closest possible separation between an electron
and a proton in a Hydrogen atom. From this comparison we can therefore conclude that the
carbon nuclei and electrons inside a white dwarf are packed together at such extreme densities
that electrons start to overlap with each other. When this happens quantum magic comes back
into play. Our earlier approximation that the white dwarf behaves like an ideal gas is invalid: the
gas in a white dwarf is called degenerate.
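
The two order-of-magnitude estimates that expose the problem, the ideal-gas central temperature (equation 3.21) and the atomic spacing (equation 3.22), can be reproduced with the short Python sketch below. G, k and the solar mass are assumed standard values; the other inputs are the rounded values used in the text.

```python
G = 6.67e-11          # gravitational constant, m^3 kg^-1 s^-2 (assumed standard value)
K_BOLTZMANN = 1.38e-23  # Boltzmann constant, J K^-1 (assumed standard value)
M_SUN = 1.99e30       # solar mass, kg (assumed standard value)
R_WD = 7.5e6          # white dwarf radius, m (rounded value from the text)
RHO_WD = 1e9          # white dwarf density, kg m^-3 (equation 3.20)
M_CARBON = 2e-26      # mass of a Carbon atom, kg (value from the text)
BOHR_RADIUS = 5.3e-11 # m (value from the text)

# Equation 3.21: central temperature if the star were an ideal gas
t_ideal = G * (1.0 * M_SUN) * M_CARBON / (2.0 * K_BOLTZMANN * R_WD)

# Equation 3.22: each atom occupies a volume m_C / rho, so d is its cube root
spacing = (M_CARBON / RHO_WD) ** (1.0 / 3.0)

print(f"Ideal-gas central temperature: {t_ideal:.1e} K")
print(f"Typical atomic spacing: {spacing:.1e} m")
print(f"Bohr radius / spacing: {BOHR_RADIUS / spacing:.0f}")
```

The spacing comes out more than ten times smaller than the Bohr radius, which is the comparison that tells us the ideal-gas picture has broken down.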

3.4.1. Electron Degeneracy

Let’s start with a founding principle of quantum mechanics:

The Heisenberg Uncertainty Principle: It is impossible to define the position x and momentum
p of a particle simultaneously to an accuracy better than that allowed by $\Delta x \Delta p \geq \hbar/2$.

Here $\hbar = h/2\pi$ and $h = 6.63 \times 10^{-34}$ J s is Planck’s constant.

In a classical ideal gas, pressure arises from the random motion of particles, resulting from
their thermal energy (as long as T > 0). An electron gas (mixed in with ions such that the
material has an overall neutral charge) at T = 0 has no random thermal motion, but it does
have motion caused by quantum mechanical effects. If the mean separation between two
neighbouring particles is ∆x, then the two particles must have momenta that differ by more
than $\hbar/(2\Delta x)$ so as not to violate the Heisenberg Uncertainty Principle. For extreme density
environments, such as the core of a White Dwarf, ∆x is very small, and hence, the average
particle momentum and velocity must be very high. This quantum random motion provides
a source of pressure, known as degeneracy pressure that can be much higher than thermal
pressure. It is this electron-degeneracy that supports a White Dwarf.

3.4.2. Degeneracy Pressure

Consider a gas made up of particles with mass m, and number density n, moving randomly in all
directions with a velocity v in each direction. Enclose these particles with a cylinder of length vdt
such that the average particles can travel the vertical length of the cylinder in time dt, as shown
in Figure 26. What pressure does this gas exert on the top circle of the cylinder which has an
area A?

Figure 26. Diagram to illustrate gas pressure on the internal surface of a cylinder enclosing the gas.
The cylinder has end area A and length v dt.

Starting from Newton’s law, the force exerted by a single particle is


(3.23)  $F = ma = m \dfrac{dv}{dt} = \dfrac{dp}{dt}$
and hence is given by the rate of change of momentum. Consider just the particle motion in the
direction perpendicular to the top end of the cylinder; in one second (dt = 1) half the particles will
bounce off the top, and half will bounce off the bottom, changing the direction of their momentum
such that dp = 2mv. Focussing on the pressure that this causes at the top of the cylinder from a
single particle, we can write
(3.24)  $P_1 = \dfrac{F}{A} = \dfrac{2mv}{A}$
The total number of particles that hit the top of the cylinder is
(3.25)  $N = \dfrac{nV}{2} = \dfrac{nAv}{2}$
where we’ve used the equation for the volume of a cylinder, V = Ah, taking the height of the
cylinder to be v (the distance travelled in one second). The factor of 1/2 comes from the fact that only half the particles will move up the
cylinder. The total pressure is therefore $P = N P_1 = n m v^2$. Re-writing in terms of the momentum p = mv
we find
(3.26)  $P = npv$

This formula applies equally to hot gas, where the momentum is owing to thermal motion17, or to
degenerate matter, where the momentum is owing to quantum motion.

17 For a thermal gas, the kinetic energy in all 3 directions is $3(mv^2/2) = 3kT/2$, and hence from equation 3.26 we
recover the ideal gas law P = nkT in equation 1.23.

From equation 3.22 we can write the mean separation between two neighbouring particles, ∆x, in terms of the number
density n:
(3.27)  $\Delta x = n^{-1/3}$
The minimum momentum for a particle (assuming zero thermal motion) is then given by the
Heisenberg Uncertainty Principle
(3.28)  $p = \dfrac{\hbar n^{1/3}}{2}$
Folding this into equation 3.26, remembering v = p/m, we find that the degeneracy pressure
(3.29)  $P_{\mathrm{degeneracy}} = \dfrac{\hbar^2 n^{5/3}}{4m}$    Degeneracy Pressure

From this we can see that quantum random motion of electrons within the white dwarf provides
a source of pressure that is completely independent of the temperature.
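
To see what equation 3.29 implies in practice, the Python sketch below evaluates it for the electrons in our typical white dwarf. It assumes fully ionised Carbon (six free electrons per atom) and a standard value for the electron mass, neither of which is stated above, so treat the result as an order-of-magnitude estimate only.

```python
import math

H_PLANCK = 6.63e-34              # Planck's constant, J s (value from the text)
HBAR = H_PLANCK / (2.0 * math.pi)
M_ELECTRON = 9.11e-31            # electron mass, kg (assumed standard value)
RHO_WD = 1e9                     # white dwarf density, kg m^-3 (equation 3.20)
M_CARBON = 2e-26                 # mass of a Carbon atom, kg (value from the text)
ELECTRONS_PER_CARBON = 6         # assumption: Carbon is fully ionised

def degeneracy_pressure(number_density, particle_mass):
    """Equation 3.29: P = hbar^2 n^(5/3) / (4 m). Note there is no temperature."""
    return HBAR ** 2 * number_density ** (5.0 / 3.0) / (4.0 * particle_mass)

n_electrons = ELECTRONS_PER_CARBON * RHO_WD / M_CARBON
p_wd = degeneracy_pressure(n_electrons, M_ELECTRON)

# Doubling the density raises the pressure by 2^(5/3), roughly a factor of 3
scaling = degeneracy_pressure(2.0 * n_electrons, M_ELECTRON) / p_wd

print(f"Electron number density: {n_electrons:.1e} m^-3")
print(f"Degeneracy pressure: {p_wd:.1e} Pa")
print(f"Pressure increase when density doubles: {scaling:.2f}")
```

Because the particle mass appears in the denominator, the light electrons contribute far more degeneracy pressure than the heavy nuclei at the same number density.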
Exercise 40. A star is composed of ionised hydrogen, with a number density of electrons
$n_e = 4.6 \times 10^{31}$ particles m$^{-3}$ in the core. What is its temperature if the momenta of the electrons due
to their thermal motions equal the uncertainty in their momenta due to the Uncertainty Principle?

Hint: To determine the thermal momentum, recall the relationship between kinetic energy and
thermal energy in equation 2.6.

3.4.3. The curious relationship between a White Dwarf’s Mass and Radius

For a normal, typical main sequence star, where gravity is supported by thermal pressure, and
no quantum trickery applies, we found in exercise 27 that for stars fuelled by the p-p chain reaction
$R \propto M^{0.08}$. The more massive a main sequence star is, the bigger it is in size. To our classical
minds this makes sense; more stuff = more volume. In quantum magic land however things don’t
always behave as our classical minds would expect.
Exercise 41. A white dwarf star is in hydrostatic equilibrium such that the central pressure is
related to its mass M and radius R as
(3.30)  $P_c = \dfrac{3GM^2}{8\pi R^4}$.
This result comes from section 1.4, equation 1.20. By equating the central pressure to the
degeneracy pressure, derive a relationship between the mass of the White Dwarf M , and its
radius R.

Clever Student Question no 6: Fascinating! We’ve looked at material at extreme densities
and have concluded that quantum uncertainty provides the pressure needed to support very
dense stars like white dwarfs. What happens to higher density stars? As the white dwarf
mass increases, we’ve shown, bizarrely, that the white dwarf radius decreases. Surely there
must come a point when the mean particle separation ∆x is so small and the corresponding
uncertainty momentum is so high that particles become relativistic and their speeds v → c.
What happens then?

This is an excellent question, solved by Chandrasekhar, who discovered that the requirement that
particle speeds satisfy v ≤ c led to a maximum possible mass for a star supported by electron
degeneracy pressure: $M \leq 1.4 M_\odot$. Dense stars more massive than this Chandrasekhar Mass
are then supported by neutron degeneracy pressure (Neutron Stars). Even more massive stars
end up as Black Holes (see section 3.6).

3.4.4. Massive Diamonds

As the white dwarf ages, whatever heat was left after the end of its horizontal branch star
Helium core burning phase eventually radiates away. Indeed it is this light that we detect from the
still luminous White Dwarf population. As the White Dwarf is supported by degeneracy pressure,
this cooling doesn’t result in the contraction of the star, and subsequent heating (as we’ve seen
for other phases of stellar evolution). Eventually a White Dwarf, left to its own devices, will
become the same temperature as the rest of the cool Universe. The Carbon in its core will form
a crystal lattice, with the degenerate electrons ‘whizzing’ around between the ions. In just a few
billion years the Earth-sized core solidifies into what must be the most amazing diamond that
humankind could ever imagine.

Looking into our scientific crystal ball to see what will become of our own Sun, it seems quite
fitting that our observations have led us to this conclusion: the Sun, that fuels our very own
existence, will end up as a stunning diamond, the jewel in the crown, at the centre of our very
own solar system. It’s only a shame that this will take around 7 billion years, and so we won’t be
around to see it.

3.5. Supernovae

Star Death in its finest form ends up with a stunning multicolour planetary nebula and diamanté
white dwarf star. For stars that start life off brightly, as the most massive stars along the
main sequence, however, a more sinister death is in store. The most massive stars become
short-lived, very luminous, supergiants where the core temperature becomes so extreme that
successive fusion s-processes occur (see section 3.3.1). Heavier and heavier elements are
created in different shells within the star, resulting in an onion-like structure around a central Iron
core (see Figure 27). Iron is the most stable element; no further fusion reactions will produce
energy and so the core starts to contract. Just as in the white dwarf star, electron degeneracy
pressure will increase as the star shrinks, but in these massive stars even it is unable to halt
the collapse. Large quantities of neutrons are created by the collapse as Iron nuclei are broken
up by photons. Elements heavier than Iron are then produced through rapid r-process reactions
with the new flux of neutrons. The core crushing continues and the core becomes neutron
degenerate. Neutron degeneracy pressure builds up quickly causing the inner part of the core
collapse to come to a sudden halt and rebound slightly. This sends a giant shock wave out
through the star, the end result of which is a giant supernova explosion (see Figure 27).
Exercise 42. How much energy is generated in the gravitational collapse of the core of a
supergiant? Assume the core has a Chandrasekhar mass with $M = 1.4 M_\odot$ and that neutron
degeneracy pressure halts its contraction at a radius of R = 15 km.

Hint: You can use the energy equation that you calculated in exercise 22 for the gravitational
collapse of a gas cloud.

Figure 27. Internal Structure of a SuperGiant (left) just before it explodes as a supernova (right).
Image adapted from CSIRO, Australia Telescope Outreach and Education.

The huge amount of energy produced by the core collapse is dissipated in four different ways.
For a short time the supernova is as luminous as billions of stars. The energy also powers
the resulting shock wave, that rips through the star and carries the chemicals far out into the
interstellar medium. The neutron star that remains spins at a rapid rate. Neutrinos, and possibly
also gravitational waves, carry the rest of the energy out into the cosmos.

3.5.1. Supernovae 1a

Remember our happy calm white dwarf, slowly solidifying into a stunning diamond to stand the
test of time? Well sadly, in some cases, stellar cannibalism prevents this from happening. Stars
often come in pairs, which follows from our theory of star formation in section 2.3. Known as
binary stars, they will typically start life with different masses and hence evolve at different rates.
We will therefore often find binary pairings of white dwarf stars with other main sequence stars18.
The gravitational pull of the neighbouring white dwarf strips the outer layers of its companion,
depositing them on the surface of the white dwarf. As the white dwarf cannibal gains in mass, its
radius shrinks (see exercise 41), and the uncertainty momentum of its electrons increases until the electron
speeds become relativistic. When the mass exceeds the Chandrasekhar limit, $M = 1.4 M_\odot$,
the white dwarf diamond core starts to collapse. When the compressed core finally becomes
neutron degenerate the core collapse will come to a sudden halt and rebound slightly causing a
supernova explosion.

18 You may have already learnt that supernovae 1a originate from a binary pairing of a white dwarf with a red giant
star. This was the leading theory for a long time but has recently been ruled out by a series of nearby supernova 1a
observations where the white dwarf’s companion is detected and found to be burning Hydrogen at its core.

This particular type of supernova is called a supernova type 1a. It is a very useful type of
star, because each supernova 1a occurs at the precise moment when the mass exceeds the
Chandrasekhar limit. It therefore looks the same, at any point in space or time, and we can
therefore use it as a standard candle to probe the geometry of the Universe (more on this in the
Cosmology part of the course).

3.6. Black Holes

For the most massive dense stellar cores, even neutron degeneracy pressure cannot halt the
gravitational collapse, and the star collapses into an infinitesimal point no bigger than this full-stop.
Frank Shu refers to the resulting Black Hole as “the Dracula of stellar corpses, lying in wait to
ensnare more matter to share its own sorry fate” (Shu 1981).
Consider the escape velocity from an object of mass M and radius R, $v_e = \sqrt{2GM/R}$ (equation 3.9).
As the object shrinks, there will be a critical radius where $v_e = c$. This is known as the Schwarzschild
Radius

(3.31)  $R_{Sch} = \dfrac{2GM}{c^2}$    Schwarzschild Radius
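
To put a number on this, the Python sketch below evaluates equation 3.31 for one solar mass and confirms that the Newtonian escape velocity at that radius equals the speed of light. The constants are assumed standard rounded values, and the function names are our own.

```python
import math

G = 6.67e-11      # gravitational constant, m^3 kg^-1 s^-2 (assumed standard value)
C = 3.00e8        # speed of light, m s^-1
M_SUN = 1.99e30   # solar mass, kg (assumed standard value)

def schwarzschild_radius(mass_kg):
    """Equation 3.31: R_Sch = 2GM / c^2."""
    return 2.0 * G * mass_kg / C ** 2

def escape_velocity(mass_kg, radius_m):
    """Equation 3.9: v_e = sqrt(2GM/R)."""
    return math.sqrt(2.0 * G * mass_kg / radius_m)

r_sch_sun = schwarzschild_radius(M_SUN)
v_at_horizon = escape_velocity(M_SUN, r_sch_sun)

print(f"Schwarzschild radius for one solar mass: {r_sch_sun / 1e3:.2f} km")
print(f"Escape velocity at that radius / c: {v_at_horizon / C:.3f}")
```

A solar-mass object would need to be squeezed to a radius of about 3 km before light could no longer escape, and the radius scales linearly with mass.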

If the object shrinks inside this radius, nothing can escape, not even light. The spherical surface
corresponding to the Schwarzschild Radius is known as the event horizon, because we can
never learn about any events inside the horizon, as information about them would need some
sort of physical communication, but no form of communication can travel faster than light. Frank
Shu describes the infall into the black hole perfectly so I will leave this section to him to conclude:

“As you approach the event horizon, no force in nature - not the electromagnetic force, not the
strong force and certainly not the weak force - can prevent you from falling toward the black hole.
As you fall it appears to me - stationed comfortably at a safe distance from the black hole - that
it takes you formally an infinite time to reach the event horizon, as the black hole’s gravity bends
the whole of space and time. Nevertheless once you start falling, effective communication ends
between us in about a millisecond, because all photon signals you may frantically try to send
out to me either are gravitationally redshifted to undetectably long wavelengths by the time they
reach me, or they fall with you into the black hole. How do you react as you fall? According
to your watch it takes only a finite time to reach and cross the event horizon, and nothing very
peculiar happens as you cross, except that all the photons which you send outward using your
flashlight cannot escape the clutches of the black hole either. Well actually, one peculiar thing
does happen to you as you fall that I forgot to mention before sending you on this journey. As you
fall feet first toward the centre of the black hole, the gravitational pull on your feet is considerably
stronger than the gravitational pull on your head. This tidal force soon stretches you out in a long
long string. Sorry. As a consolation, if you survive this modern-day torture rack, as you cross
the event horizon, you will have entered another universe whose properties none of us on the
outside will ever sample.” (Shu 1981)

Clever student question no 7: Wow, I’ve always wanted to know what the event horizon was,
and don’t like the sound of spaghettification, but I don’t believe your calculation. If the black hole
bends space and time, then the Newtonian gravity that we assumed when calculating the escape
velocity can’t hold. Furthermore, that calculation was for a particle of mass m and light doesn’t
have a mass!

You’re quite right, our derivation of the Schwarzschild Radius is a lucky result in that equation 3.31
is correct. The derivation is however completely incorrect and to calculate it correctly I encourage
you to take a course in General Relativity!

3.7. Summary

We’ve now come to the end of our astrophysical study of stars. Figure 28 beautifully summarises
the theory of stellar evolution that we’ve put to the test with observational data. There are two
points to take away from this figure:

• Live fast (massive), die young


• One star’s death, is a new star’s birth

As the cycle plays around, producing a new generation of stars, there is an increase in the
fraction of heavy chemicals, or metals, in the interstellar medium, and hence in the clouds of gas
which re-collapse to form new stars. We can see evidence for this in our own galaxy, where the
newest and most metal-rich stars are forming in the disk. Older, metal-poor stars, from an earlier
generation, live in the halo. More on how this tells us about galaxy formation in the next part of
the course!

Figure 28. Stellar Evolution; from cradle to grave....and back to the cradle again. Image adapted
from Chandra Educational Resources, NASA.
Galaxies
4: THE MILKY WAY AND OTHER GALAXIES

Figure 29. Wide angle panorama of the Milky Way from Namibia. You can also see the Large and
Small Magellanic Clouds, two dwarf galaxies which are satellites of the Milky Way. Credit : Florian
Breuer

4.1. Outline of content


• The distribution of the stars
• Dust extinction and the true shape of the Milky Way
• Island Universes

4.2. The distribution of the stars

When we look at the sky with the naked eye, we see a few thousand stars spread round the sky,
but we also see a diffuse band angled across the sky - the Milky Way. (See Fig. 29) When we
look with a telescope, we see many more stars than we do with the naked eye, and the diffuse
band at least partly breaks up into stars. What do these simple facts tell us about the system of
stars that we live in? First, let’s think about how the apparent brightness of stars changes with
the distance to them.

4.2.1. Fluxes and distances

The luminosity L of a star is the total amount of energy it radiates each second. Suppose the
Earth is at distance D from the star. The energy that left the star has spread out over a sphere
of surface area $4\pi D^2$. The energy per second flowing through each square metre is known as
the flux F. You can see that

(4.1)  $F = \dfrac{L}{4\pi D^2}$

The flux tells us the apparent brightness of a star, and it follows an inverse square law - if the
star was twice as far away, it would look four times dimmer. The stars are extremely powerful -
the luminosity of the Sun is $L_\odot = 4 \times 10^{26}$ W. On the other hand, they are very far away. The
nearest star, Alpha Cen, is at a distance of 4.37 ly19.
Exercise 43. From the numbers given at the beginning of the coursebook, calculate the flux of
sunlight at the Earth’s surface. Given that Alpha Centauri is a main sequence star that is very
similar to the Sun, estimate the flux from Alpha Centauri. The pupil of your eye has a diameter
of roughly 5mm. How much energy per second from Alpha Centauri is being collected by your
eye?

How faint a star can we see? This depends not just on the flux F , but on the area A of the
instrument we are using to detect the light - the amount of energy per second we collect, or
the power, is P = A × F . For the naked eye the relevant collecting area is that of the pupil
of the eye, with a diameter of ∼ 5mm. The minimum power the eye can detect is roughly
$P_{min} = 1.46 \times 10^{-15}$ W, which then corresponds to a minimum flux of $F_{min}(\mathrm{eye}) = 7.45 \times 10^{-11}$ W
m$^{-2}$. A pair of binoculars however might have an aperture with diameter ∼ 50 mm, and so collects
100 times as much light to present to the eye’s detector system - the retina. With binoculars you
can therefore see $F_{min} = F_{min}(\mathrm{eye})/100$. A modern telescope might have a mirror that is 4 m
across or even bigger, which means you can see $F_{min} = F_{min}(\mathrm{eye})/640000$. This is why telescopes
are big - to catch more light and so see fainter things. Of course, on a modern telescope we
don’t observe with our eye; we use a CCD detector, which is much more efficient, and we can
take long exposures. With a good detector and say an hour long exposure we can reach a flux
about a thousand times less.

So how far away can we see a star like the Sun? Turning round our flux formula above, if an
object has luminosity L and we can detect a minimum flux Fmin then we can see our object to a
maximum distance

(4.2)  $D_{max} = \left(\dfrac{L}{4\pi F_{min}}\right)^{1/2}$

For a star like the Sun we find:


Footnote 19: The light year (ly) is a handy distance unit, because astronomical distances are so large. In one year, light travels
a distance $3.00 \times 10^{8} \times 60 \times 60 \times 24 \times 365 = 9.46 \times 10^{15}$ m.

Method                                   Dmax
Naked eye (diam = 5 mm)                  69 ly
Binoculars (diam = 50 mm)                690 ly
Big telescope (diam = 4 m, using eye)    55 kly
Big telescope (CCD, 1 hour exposure)     1.6 Mly

So in principle we could see a star like the Sun over a million light years away. But how far away
are the stars in the Milky Way?
Exercise 44. Check the numbers in the above table.
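
One way to check the table is with a short script. The sketch below is a minimal illustration rather than part of the coursebook: it uses only numbers quoted in this section (the solar luminosity, the minimum power the eye can detect, the 5 mm pupil and the light year from the footnote). The function names and the `detector_gain` parameter are assumptions made for this example. A gain of exactly 1000 stands in for "about a thousand times less", so the last row should only be expected to agree roughly with the table.

```python
import math

L_SUN = 4e26          # W, solar luminosity quoted in the text
LIGHT_YEAR = 9.46e15  # m, from the light-year footnote
P_MIN_EYE = 1.46e-15  # W, minimum power the eye can detect (from the text)


def min_flux(aperture_diameter_m, detector_gain=1.0):
    """Faintest detectable flux (W m^-2) for a circular aperture.

    detector_gain is a hypothetical factor: a value of 1000 means the
    detector reaches fluxes 1000 times fainter than the eye would.
    """
    area = math.pi * (aperture_diameter_m / 2) ** 2
    return P_MIN_EYE / (area * detector_gain)


def d_max_ly(luminosity_w, f_min):
    """Equation (4.2), with the result converted from metres to light years."""
    d_max_m = math.sqrt(luminosity_w / (4 * math.pi * f_min))
    return d_max_m / LIGHT_YEAR


cases = [
    ("Naked eye (diam = 5 mm)", 5e-3, 1.0),
    ("Binoculars (diam = 50 mm)", 50e-3, 1.0),
    ("Big telescope (diam = 4 m, using eye)", 4.0, 1.0),
    ("Big telescope (CCD, assumed gain 1000)", 4.0, 1000.0),
]

for label, diameter, gain in cases:
    distance = d_max_ly(L_SUN, min_flux(diameter, gain))
    print(f"{label:40s} {distance:10.3g} ly")
```

Because $D_{\max}$ scales as the square root of the collected power, it grows in direct proportion to the aperture diameter, which is why each factor of ten in diameter buys a factor of ten in distance.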

4.2.2. Star counts vs flux

As we take deeper pictures, i.e. reaching fainter Fmin , we see more and more stars. Does this
make sense? What do we expect to see?

Suppose for the moment that all stars have the same luminosity L. If we have made a survey
down to flux limit Fmin , then the maximum distance to which we can see such a star is given by
the Dmax formula above. So round the whole sky, we can detect them within a maximum volume

$$V_{\max} = \frac{4}{3}\pi D_{\max}^{3} = \frac{1}{6\pi^{1/2}}\left(\frac{L}{F_{\min}}\right)^{3/2}.$$

Next, let’s assume that the stars are spread through infinite space with a number density of n
stars per unit volume. Substituting for $D_{\max}$, the number of stars down to a flux limit $F_{\min}$ is

$$N = n \cdot V_{\max}(L) = \frac{n}{6\pi^{1/2}} \cdot L^{3/2} \cdot F_{\min}^{-3/2} \tag{4.3}$$

This is known as the “3/2” law - when you look to fainter fluxes, the number of stars goes up
as the −3/2 power of the flux limit. What about more luminous stars? They will also follow an $F^{-3/2}$ law, but with
a different normalisation. As well as different L they will have different number density n - for
example more luminous stars are rarer. Now you can imagine adding up the contributions from
different star types. Working out the overall normalisation is quite tricky, but the shape should
definitely follow $F^{-3/2}$. So does it?
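
To make the scaling concrete, the following sketch evaluates equation (4.3) and checks it against the volume formula. It is only an illustration: the number density and luminosity are set to 1 in arbitrary units (an assumption, since the slope does not depend on them), and the function names are invented for this example.

```python
import math


def d_max(luminosity, f_min):
    """Equation (4.2): maximum distance for a flux limit f_min."""
    return math.sqrt(luminosity / (4 * math.pi * f_min))


def star_count(n, luminosity, f_min):
    """Equation (4.3): all-sky number of stars brighter than f_min."""
    return n / (6 * math.sqrt(math.pi)) * luminosity ** 1.5 * f_min ** -1.5


def star_count_from_volume(n, luminosity, f_min):
    """Same quantity computed directly as n times the volume of a sphere."""
    return n * (4 / 3) * math.pi * d_max(luminosity, f_min) ** 3


# Arbitrary units: only ratios matter for the shape of the counts.
n, lum = 1.0, 1.0
bright_limit, faint_limit = 1.0, 0.1  # second survey goes 10 times fainter

ratio = star_count(n, lum, faint_limit) / star_count(n, lum, bright_limit)
slope = math.log10(ratio) / math.log10(faint_limit / bright_limit)

print(f"Going 10x fainter multiplies the count by {ratio:.1f}")
print(f"Logarithmic slope d(log N)/d(log F) = {slope:.2f}")
```

A survey that reaches ten times fainter should therefore find about thirty times as many stars, provided the assumptions of uniform density and infinite extent hold.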

Fig. 30 shows star counts in a patch of sky well away from the Milky Way. (We choose a direction
clear of the Milky Way so it’s easier to count individual stars). This figure shows that at bright
fluxes, the counts roughly follow the expected law, but at fainter fluxes they flatten off. Which
of our assumptions above was wrong? The most likely culprit is the “infinite space” part. If the
Milky Way star system has an edge, or perhaps slowly fades out with distance, then there will be
fewer faint stars than our “3/2” formula predicts.
Exercise 45. Tricky question: The flattening off of star counts seen away from the Milky Way is
not a sharp cut-off - it’s quite gradual. However from the slopes indicated at the bright and faint
ends in Fig. 30 you can estimate a turnover point. Use this to show that the thickness of the
Milky Way in that direction is very roughly 1600 light years.

[Figure 30: plot titled “Star counts at South Galactic Pole”. Vertical axis: log (number/sq.deg). Horizontal axis: log [F(star)/F(Alpha Cen)]. Guide lines mark a slope of 1/2 and a slope of 3/2.]

Figure 30. Star counts at the South Galactic Pole, i.e. 90° away from the Milky Way. The values
represent the number of stars per square degree brighter than a given flux F, but the fluxes are
expressed relative to the observed flux of Alpha Centauri, which is a star very much like the Sun
at a distance of 4.37 ly. The open circles are counts from a small patch 6° × 6°, taken from the
SuperCOSMOS sky survey. The filled circles are from a larger region 36 degrees across - the
small patch doesn’t have enough bright stars to count accurately. Even this doesn’t have enough
very bright stars, so the open squares show counts from the all-sky SAO star catalog. However,
this has been adjusted downwards - there are fewer stars towards the Galactic Poles than there are
on average round the sky.

Exercise 46. Think about it : Can you suggest possible reasons why the flattening off of the star
counts is gradual and not sharp?

4.2.3. Star separation

So far we have estimated the distances to the stars, and seen some evidence that, when we
look away from the strip we call the Milky Way, our star system has an edge. But why is the Milky
Way “milky”? It’s not truly diffuse. When we look with a pair of binoculars, much of the milkiness
breaks into spots - the stars were just blended together. However, even with binoculars there is
still a milky background. How far apart (in angle) do we expect the stars to be, and why are they
better separated with binoculars than the eye, but still not perfectly separated?

A brief word on angles. For a mixture of historical reasons, and reasons of convenience,
astronomers, like sailors, measure angles in degrees, minutes and seconds. There are 360
degrees in a full circle, 60 arcminutes in a degree, and 60 arcseconds in an arcminute. The
standard symbols are θ°, θ′, and θ″. So, as there are 2π radians in a circle, you can see that one
arcsecond is 4.85 micro-radians. We sometimes need to consider solid angle - the equivalent
of angular area. The whole sphere is 4π steradians. In astronomy, we are normally dealing with
small areas, which are approximately flat. So if a square patch is 5″ on a side, its solid angle
is 25 square arcseconds. Note that 1 steradian (abbreviated sr) is $(180/\pi)^2$ square degrees, so
there are 41,253 square degrees over the whole sky.
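
These conversions are easy to get wrong by a factor of 60, so it can help to write them down once. The snippet below is a small sketch of the unit conversions described above; the helper names are assumptions made for this example.

```python
import math

ARCSEC_PER_DEGREE = 60 * 60  # 60 arcminutes per degree, 60 arcseconds per arcminute


def arcsec_to_rad(angle_arcsec):
    """Convert an angle from arcseconds to radians."""
    return math.radians(angle_arcsec / ARCSEC_PER_DEGREE)


def rad_to_arcsec(angle_rad):
    """Convert an angle from radians to arcseconds."""
    return math.degrees(angle_rad) * ARCSEC_PER_DEGREE


SQ_DEG_PER_STERADIAN = (180 / math.pi) ** 2
SQ_DEG_WHOLE_SKY = 4 * math.pi * SQ_DEG_PER_STERADIAN

print(f"1 arcsec = {arcsec_to_rad(1) * 1e6:.2f} micro-radians")
print(f"1 radian = {rad_to_arcsec(1):.0f} arcsec")
print(f"Whole sky = {SQ_DEG_WHOLE_SKY:.0f} square degrees")
```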

The angular distance between stars. Above we saw that stars like the Sun can be seen to
a distance of ∼ 70ly. We also noted that the nearest star, Alpha Centauri, is 4.37ly away. In
fact, the typical distance between stars is about 1 ly. At D = 70ly, a horizontal separation of 1 ly
corresponds to an angular separation of 49 arcmin - almost one degree. Stars we can see with
binoculars are a hundred times further away, and so should be separated by around 30 arcsec.
Stars we can detect with big telescopes will be even further away, and will be separated by less
than an arcsecond. With these kinds of separations, can we distinguish separate stars or not?

4.2.4. Angular resolution

The image of a star on our retina, or on our detectors, is not an infinitely small dot, but has a
spread in angle θ, caused by various kinds of blurring. Roughly speaking, if the angular distance
between two stars is closer than that blurring size θ, then we can’t distinguish them. Various
different effects can cause blurring.

(i) Diffraction. Physics tells us that any optical system with aperture diameter D causes a
blurring with angular diameter θ given by

$$\theta = \frac{1.22\lambda}{D} \tag{4.4}$$

where λ is the wavelength of light.


Exercise 47. Show that, for visible light at wavelength 500 nm the diffraction limited resolution
for the human eye is roughly 25″, whereas for a 2 m telescope it is 0.07″.

Note : the formula above is in radians...
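
The radians-to-arcseconds step is where most slips happen. As a hedged sketch (the function name is an assumption for this example), equation (4.4) can be evaluated like this, taking the 5 mm pupil and the 2 m telescope mentioned in the text:

```python
import math


def diffraction_limit_arcsec(wavelength_m, aperture_m):
    """Equation (4.4): diffraction blur, converted from radians to arcseconds."""
    theta_rad = 1.22 * wavelength_m / aperture_m
    return math.degrees(theta_rad) * 3600


WAVELENGTH = 500e-9  # m, visible light as in Exercise 47

eye = diffraction_limit_arcsec(WAVELENGTH, 5e-3)
telescope = diffraction_limit_arcsec(WAVELENGTH, 2.0)

print(f"Human eye (5 mm pupil): {eye:.1f} arcsec")
print(f"2 m telescope:          {telescope:.3f} arcsec")
```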

(ii) Optical distortions, usually known as aberrations, arising from the fact that no optical system
is perfect. The human eye suffers from an effect known as spherical aberration which in practice
gives it a resolution of ∼ 1 arcminute = 60″.

(iii) Atmospheric turbulence. This is the effect which astronomers hate most! As light travels
through the atmosphere it gets bent to and fro by refraction in turbulent air cells, which ends up
making a blurred “seeing” image. On a good site this is around 1″ across, and on a poor one
may be 5″ across.

The net effect is that

• the naked eye is limited by aberration, and can separate things about 1 arcminute apart
• a ground-based telescope is limited by atmospheric turbulence and can separate things
about 1″ apart
• a space-based telescope 2m across, like HST, is limited by diffraction and can separate
things about 0.1″ apart.

Above we saw that naked eye stars should be about a degree apart, so we can easily separate
them. Much fainter stars will be much further away, and so closer together on the sky, and so
may not be seen as individual objects - their blended light will add up and produce a diffuse
background. The stars we can detect with binoculars are about 30 arcsec apart. These would
be too close to separate by eye, but can be resolved with the binoculars. However there will be
even more distant/faint stars which blur together and give a milky background. But how much
should the summed-light from those background stars add up to?

4.2.5. The brightness of the sky

How bright should the summed light of all those distant stars be? This calculation was first done
by Olbers and has an unexpected result. If the universe is infinite, the sky should be infinitely
bright! The stars at all distances contribute equally - the increased numbers of more distant stars
exactly counterbalance their reduced brightness.

Consider a spherical shell of thickness dR at distance R from us. If the stars have number
density n, the number of stars in the shell is $N = 4\pi n R^2\,dR$. If each star has luminosity L then
it produces a flux at Earth of $F = L/4\pi R^2$. The total flux is then $F_{\rm tot} = N F$, and the surface
brightness (the flux of light coming from each unit solid angle on the sky, remembering that there are
4π steradians over the whole sky) is

$$B = F_{\rm tot}/4\pi = n L\,dR/4\pi$$

This is the surface brightness caused by that single shell of thickness dR. To get the overall
surface brightness of the sky, we need to add up the light from all the shells at different R.
However note that R has cancelled - the contribution from each shell is just as much, regardless
of how far away it is. So if the system of stars goes on forever, does this mean that the total
amount of light from the sky should be infinite? Not quite, because eventually distant stars are
physically behind other stars and their light is blocked. Now imagine a random sight-line through
the star system. If the star system is infinite, eventually that sight-line will hit a star. What we
conclude from this is that the whole sky should be as bright as the surface of the Sun. But it
clearly is not. What is going on?
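
The cancellation of R can be checked numerically. The sketch below adds up shells of equal thickness, ignoring the blocking of light by foreground stars; it uses arbitrary units with n = L = 1 (an assumption, since only the scaling matters) and function names invented for this example.

```python
import math


def shell_surface_brightness(n, luminosity, radius, thickness):
    """Surface brightness from one thin shell of stars at the given radius."""
    number_of_stars = 4 * math.pi * n * radius ** 2 * thickness
    flux_per_star = luminosity / (4 * math.pi * radius ** 2)
    return number_of_stars * flux_per_star / (4 * math.pi)


def sky_brightness(n, luminosity, r_max, thickness):
    """Sum the shells out to r_max, with no blocking by nearer stars."""
    n_shells = int(round(r_max / thickness))
    return sum(
        shell_surface_brightness(n, luminosity, (i + 0.5) * thickness, thickness)
        for i in range(n_shells)
    )


near = shell_surface_brightness(1.0, 1.0, radius=1.0, thickness=0.1)
far = shell_surface_brightness(1.0, 1.0, radius=1000.0, thickness=0.1)
print(f"Shell at R = 1:    {near:.6f}")
print(f"Shell at R = 1000: {far:.6f}")

for r_max in (100, 200, 400):
    print(f"Out to R = {r_max}: B = {sky_brightness(1.0, 1.0, r_max, 1.0):.3f}")
```

Each shell contributes the same amount, so the summed brightness grows in direct proportion to the outer radius and never converges unless the star system ends or the light is blocked.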

4.3. Dust extinction and the true shape of the Milky Way

So far, we have seen several rather interesting things.

• When we look away from the Milky Way, our star counts flatten off at faint fluxes,
compared to the expectation from the “3/2” law. (See Fig. 30). The most likely explanation
is that we are seeing past the edge of the star system.
• In the direction of the Milky Way the starlight blurs together into smooth light, whereas
away from the Milky Way there is no diffuse background - so the star system must be
much thicker in the direction of the Milky Way.
• The Milky Way is a strip that goes all the way round the sky. This suggests that we live
inside a disc of stars.
• The Milky Way looks very patchy (see Fig. 29). Either the stars are distributed in a very
irregular fashion, or something patchy is blocking some of the light.
• The smooth light is nowhere near as bright as the Sun. This suggests that the disc of
stars is not of infinite size, or again, that some of the light has been blocked out.

If we count stars in different directions, remembering that the fainter stars are further away, we
should be able to map out the shape of the Milky Way. This method of star counting was first used
by William Herschel in the eighteenth century. He produced the map shown on the left in Fig. 31,
and this would still be the map you would get today if you did the experiment in a simple-minded
way. It seems to show that we are near the centre of the system, and the structure, while
flattened, doesn’t look like a nice neat disc, but more like a squashed fly. Contrast this with a
modern map made by the Two Micron All Sky Survey (2MASS; see the right panel of Fig. 31).
This is made by surveying the sky in infra-red light (at a wavelength of 2µm), and shows a nice
thin disc, with a central bulge. Furthermore, we are not in the centre, but halfway out. Why are
these maps so different? What did Herschel do wrong? The answer is that he didn’t know about
dust, which is what we need to look at next.

Figure 31. Left : Map of our star system deduced by Herschel from star counts in the eighteenth
century. Right : All-sky map produced by the 2MASS infra-red survey. The patches to the lower
right are the Magellanic Clouds. Credit : IPAC/2MASS project

4.3.1. Dust particles in the interstellar medium

Pictures of the Milky Way in visible light look quite messy, and show dark streaks, as well as
diffuse light. Some patches of sky show very distinct dark patches. This is not due to an
absence of stars. It is because the interstellar medium - i.e. the space between the stars -
is not empty. It contains many small solid particles, which block out some of the background
light. These particles are known as “dust” although a better name might be “soot” or “smoke”,
because they are micron-sized rather than mm-sized, rather like smoke particles. There are
actually two different physical effects - true absorption, and scattering of the light - but even the
scattering removes light from the beam. The net reduction effect is referred to as “extinction”, i.e.
the light is (partially) extinguished. (Not “extincted”...) We will look at the properties of dust more
carefully in Chapter 6, when we look at the contents of galaxies. Here we will look at its effect on
how far we can see through the Milky Way.

4.3.2. Dust extinction vs distance

The dust is finely sprinkled through interstellar space - so the further a light beam travels, the
more it gets extinguished through absorption and scattering by the dust. Through any small
section of the interstellar medium, there is just a small probability that a photon hits a dust particle
and gets blocked. However, the further the photon travels, the more likely it is to eventually hit a
piece of dust. How do we use this to calculate the amount of extinction with distance?

[Figure 32: diagram labelled with the fluxes F0, F and F − dF, the distance x, and the slab thickness dx.]

Figure 32. Illustrating how a light beam reduces in intensity as it travels through the interstellar
medium. See text for details.

Consider a beam of light travelling through the interstellar medium, starting with flux $F_0$ and
gradually reducing in flux as it gets absorbed (see footnote 20). Consider a small slab of dust with thickness dx
as shown in Fig. 32. Travelling through the slab the beam loses an amount dF. Our argument
above is that the probability of each photon getting absorbed is proportional to the distance; this
means that

$$\frac{dF}{F} = -k\,dx$$

where k is an unknown constant. In other words, it is the fractional change in flux which is
proportional to distance. Integrating the equation above we get

$$\ln F = -kx + C$$

At x = 0 we have the starting flux $F = F_0$ which gives $C = \ln F_0$ and so

$$\ln F = \ln F_0 - kx$$

This gives us $F/F_0 = e^{-kx}$ - the flux reduces exponentially as it travels through the slab. It is
helpful if we define $x_e = 1/k$ so that

$$\frac{F}{F_0} = e^{-x/x_e} \tag{4.5}$$

Footnote 20: To keep life simple we are assuming the beam is parallel rather than diverging. In reality the usual $1/R^2$ spreading
out is happening as well.

Here the quantity $x_e$ is known as the scale length - the distance over which the flux changes by
a factor 1/e, or 0.37. The scale length for the reduction of visible light through the plane of the
Milky Way has been measured by comparing similar stars at different distances. It is roughly
$x_e = 2.22$ kly. (Of course it varies rather a lot from place to place; this is an average.) Notice that
after three scale lengths light will reduce by roughly a factor 20 (check it...) so that’s a reasonable
estimate to use of how far we can see - about 6-7 kly. We now know the centre of our star system
is at a distance of roughly 28kly. Only a small fraction of the light can get here from that distance.
So actually we simply can’t see the centre of the Milky Way. We can only see a relatively local
region around us.
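
The "check it" above takes one line of arithmetic. The following sketch evaluates equation (4.5) and its inverse; the function names are assumptions made for this example, and the scale length is the average visible-light value quoted in the text.

```python
import math

X_E_VISIBLE_KLY = 2.22  # kly, average visible-light scale length from the text


def transmitted_fraction(distance, scale_length):
    """Equation (4.5): fraction F/F0 of the flux that survives the dust."""
    return math.exp(-distance / scale_length)


def distance_for_fraction(fraction, scale_length):
    """Invert equation (4.5): distance at which F/F0 falls to `fraction`."""
    return -scale_length * math.log(fraction)


reduction = 1 / transmitted_fraction(3 * X_E_VISIBLE_KLY, X_E_VISIBLE_KLY)
print(f"After three scale lengths the flux is down by a factor {reduction:.1f}")

reach = distance_for_fraction(1 / 20, X_E_VISIBLE_KLY)
print(f"A factor 20 reduction is reached after {reach:.2f} kly")
```

Both functions take the distance and the scale length in the same units, so the same code applies to any other wavelength once its scale length is known.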

4.3.3. Dust extinction vs wavelength

The extinction of light by dust also depends on the wavelength of the light. Longer wavelength
light is extinguished less, and so has a longer scale length. The 2MASS survey was performed
at a wavelength of 2.2µm. Note that most stars emit at least some light in the IR, and smaller
cooler stars actually emit most of their light in the IR, so this is a perfectly good way to map out
starlight. The extinction scale length through the Milky Way at 2.2 µm is about 19.8 kly. So in the
IR, it’s quite easy to see clean through the Milky Way.
Exercise 48. Given the information above, estimate the reduction factor suffered by light travelling
between the centre of the Milky Way and the solar system, for (i) visible light and (ii) IR light.

4.3.4. The shape of the Milky Way Galaxy

Star surveys, after correcting for the effects of dust, show us that we really do live in a relatively
thin disc of stars - about 100kly across and 1kly thick; we live about halfway out through this disc.
Meanwhile, another striking feature that we see in the IR map seen in Fig. 31 is that our star
system thickens out towards the centre. This is known as the bulge. Rather than looking like a
squashed fly, as in the Herschel map, our star system is rather like two fried eggs back to back.

The term “Milky Way” is now used to mean the whole system of stars that we live in, rather
than the diffuse band we see stretching across the sky. Our star system is also known as “The
Galaxy”, which comes from the Greek “Galaxias Kiklos” which means “Milky Circle”.

4.4. Island Universes

So is this flat star system the entire Universe? Until the twentieth century this did indeed seem
to be the case. However, dotted around the sky there are various other diffuse structures. These
are known as “Nebulae” which is Latin for “clouds”. It turns out that some of these truly are gas
clouds, and are inside the Galaxy, but others are actually giant conglomerations of stars. (See
Fig. 33.) One of the great achievements of the twentieth century was measuring the distances to
these giant star clusters, and finding that they are completely outside the Milky Way. In Chapter
5 we shall see how this is done.

Figure 33. Left : Orion Nebula, a gas cloud inside the Milky Way (Credit: NASA/HST). Right : The
Andromeda Nebula, an external galaxy outside the Milky Way. (Credit: Jack Newton)

The very existence of these “island universes” changed our perception of the universe as a
whole, and it stays much the same today. Fig. 34 shows two more of these star-cities, which
look rather like our modern picture of our own Galaxy - a thin disc of stars, in which we can also
see the effects of gas and extinguishing dust in the interstellar medium. Of those two example
galaxies, one happens to be oriented face-on with respect to our line of sight, and one is oriented
edge-on. Unfortunately, with our star system the Milky Way, we don’t have the luxury of travelling
outside it to see it from different angles! In the face-on example we see another interesting
feature - spiral structure in the disc. We will see later that this is common but not universal.

The island universes became known as “galaxies” - with a small g to distinguish them from “The”
Galaxy, i.e. our own. The universe is made up of huge numbers of these galaxies, and we
happen to live inside one of these. The rest of this part of the course is about studying the
properties of these galaxies, including our own.

Figure 34. Left : The galaxy NGC 2903, similar to the Milky Way and seen face-on. Right : NGC
891, a galaxy seen edge-on. (Credit: Russell Croman)
5: THE COSMIC DISTANCE SCALE

5.1. Outline of content


• Distances inside the solar system
• Parallaxes of the nearest stars and the parsec unit
• Stars as standard candles
• Nearby galaxies : Cepheid variables
• Distant galaxies : the supernova method
• Distant galaxies : the Tully-Fisher method

To make progress understanding stars, the Milky Way, and distant galaxies, we need to know
how far away they are. Establishing distances in astronomy is a chain of deduction, stepping from
the solar system, to nearby stars, to the Milky Way, to nearby galaxies and to distant galaxies.
This is sometimes known as the Cosmic Distance Ladder.

5.2. Distances inside the solar system

Distances inside the solar system are based on Kepler’s laws, which in turn are based on simple
Newtonian physics. Planets move in slightly elliptical orbits, but we can see the basics by
approximating all planetary orbits as circles. Consider a small mass m orbiting a large central
mass M at radial distance R. We will assume that M >> m, so that the centre of mass, which
will be the centre of the orbit, is at the position of the large mass M . The Sun is 333,000 times
as massive as the Earth, so this is a pretty good approximation.
Exercise 49. A spacecraft approaches the Earth from a large distance. How close does it need
to be to the Earth for the Earth’s gravity to dominate over the Sun’s?

The gravitational force on the mass m is $F_{\rm grav} = GMm/R^2$. Newton’s second law, F = ma, tells
us that this force causes an acceleration $a_{\rm grav} = GM/R^2$. (Note that m cancels - only the large
mass M matters.) Suppose the orbital velocity of our small mass is v. We know that motion in a
circle of radius R at velocity v corresponds to an acceleration $a_{\rm circ} = v^2/R$. If the gravitational
force is what is causing the motion in a circle, then we must have $a_{\rm circ} = a_{\rm grav}$. So we find that
the orbital velocity must be given by

$$v^2 = \frac{GM}{R} \tag{5.1}$$

This is often considered to be the single most important equation in astronomy, because it is
used so often, and is (almost) the only way we have to measure masses. For example, if we
know the radius of the Earth’s orbit $R_E$ (see footnote 21) then we can calculate the orbital velocity of the Earth,
because it takes one year to travel a distance $2\pi R_E$. We can then solve for the mass of the Sun.
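
As a rough illustration of this recipe, the sketch below works through the two steps numerically. It assumes the standard value of the gravitational constant, G = 6.674 × 10^-11 in SI units, which is not quoted in this section, together with the accepted astronomical unit of 1.496 × 10^11 m and a 365-day year as used in the light-year footnote. The function names are invented for this example.

```python
import math

G = 6.674e-11               # m^3 kg^-1 s^-2, standard value (assumed, not from this section)
R_EARTH_ORBIT = 1.496e11    # m, the astronomical unit
YEAR = 60 * 60 * 24 * 365   # s, same year length as the light-year footnote


def orbital_velocity(radius_m, period_s):
    """Speed needed to cover the circumference 2*pi*R in one orbital period."""
    return 2 * math.pi * radius_m / period_s


def central_mass(radius_m, velocity_m_s):
    """Equation (5.1) rearranged: M = v^2 R / G."""
    return velocity_m_s ** 2 * radius_m / G


v_earth = orbital_velocity(R_EARTH_ORBIT, YEAR)
m_sun = central_mass(R_EARTH_ORBIT, v_earth)

print(f"Orbital velocity of the Earth: {v_earth / 1000:.1f} km/s")
print(f"Mass of the Sun:               {m_sun:.3g} kg")
```

The same two functions apply to any small body on a roughly circular orbit around a much larger mass, for example a moon around a planet.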

The planets are at various different distances from the Sun. For now, we don’t know these.
However, what we can easily measure, given enough patience, is the orbital periods of the
planets. For a planet at distance R, the time taken to travel 2πR at velocity v as given above is

$$T = \frac{2\pi R}{v} = 2\pi\left(\frac{R^3}{GM}\right)^{1/2} \tag{5.2}$$

This is Kepler’s third law - the square of a planet’s orbital period is proportional to the cube of its
distance from the Sun. Given the measured orbital period we could find its distance from the
Sun, if only we knew the mass of the Sun. Another way to approach this is to note that the above
formula applies to the Earth, with R = RE and T = 1 year. From this we find that for each planet

$$\left(\frac{R}{R_E}\right)^{3} = \left(\frac{T}{1\ \mathrm{yr}}\right)^{2}$$

The radius of the Earth’s orbit $R_E$ is a fundamental parameter in astronomy and is called the
“Astronomical Unit” : $R_E = 1$ AU. Everything else is known relative to this.
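
The scaling relation can be applied directly to measured orbital periods. In the sketch below the periods are standard rounded values that do not appear in this section, so they should be treated as assumptions; the function name is also invented for this example.

```python
def orbit_radius_au(period_years):
    """Kepler scaling relative to the Earth: (R / R_E)^3 = (T / 1 yr)^2."""
    return period_years ** (2 / 3)


# Orbital periods in years: standard rounded values, assumed for illustration.
periods_yr = {
    "Mercury": 0.241,
    "Venus": 0.615,
    "Earth": 1.0,
    "Jupiter": 11.86,
    "Neptune": 164.8,
}

for planet, period in periods_yr.items():
    print(f"{planet:8s} T = {period:7.3f} yr   R = {orbit_radius_au(period):6.2f} AU")
```

Note that this gives every distance in units of the Earth’s orbital radius only; no distance in metres comes out until the astronomical unit itself has been measured.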

5.2.1. Radar distance to Venus

So we can know all the planetary distances if we can just unlock one! The key measurement is
the distance to Venus. Historically this was measured using a parallax method, but today it can
be measured by radar - by bouncing a signal from Venus and timing the return. If the time taken
is t then the deduced distance is d = ct/2.

In order to get RE from this, we have to make the measurement at just the right time. To get the
idea, examine Fig. 35. The angle θe on the sky between the direction of the Sun and the direction
of Venus is known as the angle of elongation; as the Earth and Venus move this changes of course.
However, when it is at its maximum value, the line joining Earth to Venus is perpendicular to the
line joining Venus and the Sun, and we have a nice easy triangle to solve :

RE = d/ cos θe
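
The geometry can be checked for self-consistency with a short script. The sketch below does not use real radar data: it starts from the accepted astronomical unit and an assumed orbital radius for Venus of 0.723 AU (a standard value not given in this section), predicts the maximum elongation and the radar round-trip time, and then recovers $R_E$ from those two numbers using d = ct/2 and $R_E = d/\cos\theta_e$. All function names are invented for this example.

```python
import math

C = 3.00e8               # m/s, speed of light as used in the light-year footnote
R_E_TRUE = 1.496e11      # m, accepted astronomical unit
R_V_OVER_R_E = 0.723     # Venus orbit radius in AU (assumed standard value)


def radar_distance(round_trip_s):
    """Distance to the target from the round-trip time of the echo: d = c t / 2."""
    return C * round_trip_s / 2


def earth_orbit_radius(distance_m, max_elongation_deg):
    """Right-angled triangle at maximum elongation: R_E = d / cos(theta_e)."""
    return distance_m / math.cos(math.radians(max_elongation_deg))


# Forward model: what an observer would measure at maximum elongation.
theta_e_deg = math.degrees(math.asin(R_V_OVER_R_E))
d_true = R_E_TRUE * math.cos(math.radians(theta_e_deg))
round_trip_s = 2 * d_true / C

# Inversion: recover the astronomical unit from the two "measurements".
r_e_recovered = earth_orbit_radius(radar_distance(round_trip_s), theta_e_deg)

print(f"Maximum elongation:   {theta_e_deg:.1f} degrees")
print(f"Radar round trip:     {round_trip_s:.0f} s")
print(f"Recovered R_E:        {r_e_recovered:.4g} m")
```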

Footnote 21: We will find this shortly - but also see Problem Solving Workshop No. 1.

[Figure 35, left panel: diagram labelled with the Sun, the orbit of Venus, Venus, Earth, the radii RV and RE, and the angle θe.]

Figure 35. Left : Measuring the size of the Earth’s orbit (RE ) using the distance to Venus (d) at the
time when the angle of elongation θe is at maximum. Right : The parallax of a nearby star compared
to more distant stars tells us its distance, if we know RE , the radius of the Earth’s orbit.

The accepted value of the astronomical unit is


$$R_E = 1\ \mathrm{AU} = 1.496 \times 10^{11}\ \mathrm{m}.$$
Solar system distances cover quite a range, as this table shows:

Object                           Average distance from Sun
Mercury                          0.4 AU
Earth                            1 AU
Jupiter                          5 AU
Neptune                          30 AU
Dwarf planets and Kuiper Belt    30-50 AU
Oort Cloud                       ∼50,000 AU

The Oort cloud has never actually been observed; it is hypothesised as the region from which
the comets come.
Exercise 50. What fraction of the distance to Alpha Centauri is the Oort Cloud? (Hint: link back
to the previous chapter and convert distance units as appropriate)

5.3. Parallaxes of the nearest stars and the parsec unit

Armed with the AU, we can find the distance to the nearest stars. This is because nearby stars
seem to make a small annual angular motion on the sky; as the Earth moves around its orbit the
angle towards the star changes by a small amount, as shown in Fig. 35. This is the phenomenon
of stellar parallax.

The size of the angular motion is inversely proportional to the distance to the star; very distant
stars will show a negligible effect. This is important, because the only way we can measure the
effect is by following the motion of the nearby star compared to the fixed pattern of background
stars. We can express this either in standard units (radians, metres) or in AU and arcsec :

$$D(\mathrm{m}) = R_E(\mathrm{m})/\theta(\mathrm{radians}) \quad \text{or} \quad D(\mathrm{AU}) = 206{,}265/\theta''$$

The longstanding convention in astronomy however is to define the distance at which a star has
an annual parallax of 1 arcsec as 1 parsec (pc). Then

$$D(\mathrm{pc}) = 1/\theta'' \tag{5.3}$$

This definition may seem arbitrary, but historically it was crucial, because of uncertainty in the
value of the AU.
Exercise 51. Think about it : Mediaeval astronomers believed the Earth was at the centre of
the Universe and the Sun went around the Earth. Copernicus proposed that instead the Earth
went around the Sun. Why was the difficulty of measuring parallaxes a problem for Copernican
theory? (The first parallaxes were not measured until 1839.)

We now know the value of 1 AU accurately and so we also know that


$$1\ \mathrm{pc} = 3.086 \times 10^{16}\ \mathrm{m} = 3.262\ \mathrm{ly}.$$
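
The parsec follows from the astronomical unit and the number of arcseconds in a radian. As a minimal sketch (the constant and function names are assumptions for this example), using the values of the AU and the light year quoted earlier:

```python
import math

AU = 1.496e11          # m, accepted value of the astronomical unit
LIGHT_YEAR = 9.46e15   # m, from the light-year footnote

ARCSEC_PER_RADIAN = 180 * 3600 / math.pi
PARSEC = AU * ARCSEC_PER_RADIAN  # distance at which 1 AU subtends 1 arcsec


def distance_pc(parallax_arcsec):
    """Equation (5.3): distance in parsecs from the annual parallax in arcsec."""
    return 1 / parallax_arcsec


print(f"Arcseconds per radian: {ARCSEC_PER_RADIAN:.0f}")
print(f"1 pc = {PARSEC:.4g} m = {PARSEC / LIGHT_YEAR:.3f} ly")
print(f"A parallax of 0.2 arcsec corresponds to {distance_pc(0.2):.1f} pc")
```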
The nearest known star, Proxima Centauri, is at a distance of 1.30 pc, with Alpha Centauri just
slightly further away at 1.34 pc. With ground based measurements, measuring a parallax of
less than about 0.2″ is very hard, so historically only a small number of stars, within roughly
5 pc or 16 ly, had accurately known distances - 74 known stars to date. The situation was
transformed by the Hipparcos space mission which could measure parallaxes with an error of
1 mas (milli-arcsec), although only for relatively bright stars. It measured distances to ∼ 20,000
stars to 10% accuracy, and enabled us to make the HR diagram you saw earlier in the course.
Another revolution is imminent. As I write, the Gaia satellite, which should measure parallaxes
to an error level of 10 µas, is scanning the sky and should produce results soon.
Exercise 52. The parallax of a particular star is measured by Hipparcos to 5% accuracy with a
1 mas error. How many times further away than Alpha Centauri is this star?

5.4. Stars as standard candles

5.4.1. Distances from individual star types

Our distances so far have relied either on Newton’s laws (planetary orbits) or timing (Venus
radar) or geometry (parallax). Now we come to the crucial idea of the “standard candle”. If we
know the true luminosity L of an object and measure its flux F , then we can deduce the distance
$D = (L/4\pi F)^{1/2}$. How do we know the true luminosity of a star? Essentially by locating it on
the HR diagram. (Check back to Fig. 18 in Chapter 2). The simplest method is to measure the
colour of the star, using this to estimate its temperature. Then we can take a vertical line until
we hit the main sequence, then a horizontal line to read off the luminosity. Because colours are
quite easy to measure even for faint stars, we can use this to get rough distances to many stars,
and to large distances across the Milky Way. However, there are two problems.

(i) Colour doesn’t uniquely identify the type of star. If we see a blue star, it might be a massive
blue main sequence star - or it might be a tiny hot white dwarf.

(ii) Dust between us and the star will reduce the flux, and make us think the star is further away
than it really is. Furthermore, because dust extinction depends on the wavelength of light, the
blue light is extinguished more than the red light, and so the star appears the wrong colour.

Both these problems can be gotten round by detailed spectroscopy of stars, i.e. by breaking the
light up into its full spectrum - if we see specific spectral lines, we can be more sure of what type
of star it is. However, getting a good spectrum is much harder than simple colour measurement,
so in practice we don’t often do this.
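
The size of the bias in problem (ii) can be estimated by combining the standard candle formula with the exponential extinction law of Chapter 4. The sketch below is illustrative only: it assumes the average visible-light scale length of 2.22 kly quoted there, works in arbitrary luminosity units with distances in kly, and uses function names invented for this example.

```python
import math

X_E_VISIBLE_KLY = 2.22  # kly, average visible-light scale length from Chapter 4


def observed_flux(luminosity, distance, scale_length=None):
    """Inverse-square flux, optionally dimmed by dust as exp(-distance/scale_length)."""
    flux = luminosity / (4 * math.pi * distance ** 2)
    if scale_length is not None:
        flux *= math.exp(-distance / scale_length)
    return flux


def candle_distance(luminosity, flux):
    """Standard candle distance D = (L / 4 pi F)^(1/2), ignoring dust."""
    return math.sqrt(luminosity / (4 * math.pi * flux))


true_distance = 2.22  # kly: a star one scale length away in the plane of the Milky Way
dimmed = observed_flux(1.0, true_distance, X_E_VISIBLE_KLY)
inferred = candle_distance(1.0, dimmed)

print(f"True distance:     {true_distance:.2f} kly")
print(f"Inferred distance: {inferred:.2f} kly")
print(f"Overestimate:      factor {inferred / true_distance:.2f}")
```

Because the distance depends on the square root of the flux, a star dimmed by a factor e appears further away by a factor of only $e^{1/2}$, but the error grows rapidly for stars many scale lengths away.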

Exercise 53. From its spectrum, a star appears to be a ten solar mass star. When we measure
its flux, it seems to be six times fainter than Alpha Centauri. Estimate the distance of this star.
(Hint : Remember the formula for how luminosity varies with mass, and note that Alpha Centauri
is very similar to the Sun).

5.4.2. Cluster sliding fits

We can get a more accurate distance estimate, and get round problem (i) - the ambiguity of what
type of star we are really looking at - by making colour and flux measurements for all the stars
in a cluster of stars, and making the HR diagram for that cluster. This assumes that to a good
approximation, all the stars in that cluster are at the same distance from us. Then comparing the
HR diagram to the standard one, we can work out the distance to the cluster. Alternatively, we
can compare one cluster to another - by doing a kind of sliding fit, as illustrated in Fig. 36, we
can find the ratio of fluxes in one cluster compared to the other. We then know the ratio of their
distances, even if we don’t know their absolute distances. Historically, this was important; lots of
distances were tied to the distance to the nearby Hyades star cluster.
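
A sliding fit amounts to finding the vertical shift, in log flux, that lays one cluster sequence on top of the other. The sketch below is a simplified illustration with made-up numbers, not real cluster photometry: it assumes the two clusters have been sampled at the same colours, and the function names are invented for this example.

```python
import math


def log_flux_offset(log_flux_near, log_flux_far):
    """Mean shift in log10(flux) needed to slide the far sequence onto the near one."""
    shifts = [near - far for near, far in zip(log_flux_near, log_flux_far)]
    return sum(shifts) / len(shifts)


def distance_ratio(offset_dex):
    """Ratio D_far / D_near for a flux ratio of 10**offset_dex, since F scales as 1/D^2."""
    return 10 ** (offset_dex / 2)


# Synthetic main sequences sampled at matching colours (illustrative values only).
near_cluster = [2.0, 1.5, 1.0, 0.5]
far_cluster = [value - math.log10(4) for value in near_cluster]  # 4 times fainter

offset = log_flux_offset(near_cluster, far_cluster)
print(f"Offset in log flux: {offset:.3f}")
print(f"Distance ratio:     {distance_ratio(offset):.2f}")
```

In this synthetic case the far cluster is four times fainter at every colour and therefore twice as far away. Real data would need interpolation between colours and some care with stars that have evolved off the main sequence.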

Exercise 54. From Fig. 36, it seems the stars in the Pleiades are systematically ∼ 7 times fainter
than the stars in the Hyades. What is the ratio of the Pleiades distance to the Hyades distance?

5.4.3. The scale of the Milky Way

To get some perspective, let’s have a look at the distances found for some famous objects, and
the overall scale of the Milky Way. Note however that the Milky Way does not have a hard edge
- it fades out gradually, as we discussed in Chapter 4.

[Figure 36, right panel: plot of apparent flux of star against star colour (blue to red), with the Pleiades and Hyades sequences labelled.]

Figure 36. Measuring the distances to clusters. The left image shows the Hyades and Pleiades
clusters, which can both be seen with the naked eye, and are about 13° apart on the sky. The right
image shows their HR diagrams.

Object                                    Distance / size
Nearest star: Proxima/Alpha Centauri      1.3 pc
Brightest star in sky: Sirius             2.6 pc
Pleiades star cluster                     133 pc
Star forming region: Orion Nebula         389 pc
Deneb                                     800 pc
Centre of the Galaxy                      8.3 kpc
Diameter of Milky Way                     ∼ 30 kpc

5.5. Nearby galaxies : Cepheid variables

5.5.1. Can we see stars in external galaxies?

The current accepted distance to M31, the Andromeda Nebula, which is the nearest galaxy of
a similar size to the Milky Way, is 778kpc. (The Milky Way has some dwarf satellites, such as
the Magellanic Clouds, which are closer). At this distance, a star like the Sun would be almost
a trillion times fainter than Alpha Centauri. It is possible to detect things this faint, but you need
a long exposure on a big telescope; and furthermore such a star would be hopelessly blended
with its neighbours.
Exercise 55. Recalling that stars are about 1 ly or 0.3pc apart, what is the typical angular
separation of stars in a galaxy like the Milky Way, at a distance of 1Mpc? Consider a box of
one arcsecond diameter on the sky, looking at the distance of a face-on Milky Way-like galaxy.
Estimate very roughly how many stars might be inside that box.

The most luminous stars can be much brighter - remember that for main sequence stars $L \propto M^4$
and the biggest stars are $\sim 50\,M_\odot$. Furthermore, some stars - such as the Cepheids discussed
below - can be even more luminous at certain stages in their evolution, and so stand out as
bright spots on top of the general starlight background. Such stars are very rare, but this helps,
because they are well separated in the image of a galaxy. But how do we know we have seen
such a rare star, as opposed to, for example, an unresolved cluster of perhaps thousands of
stars?

5.5.2. Pulsating variables

The easiest stars to spot are pulsating variables, which change in brightness periodically over
10–100 days. Various star types do this, but the most famous and useful examples are the
Cepheid variables. Pulsating variables were discussed in Chapter 3. The key thing is the
period-luminosity relationship shown in Fig. 21.

Consider taking repeated images of the same galaxy. Bright spots in the image correspond to
extremely luminous stars standing out above the general background. You notice that one of
these varies in brightness periodically. If you measure the period P, you can locate that value of
P on Fig. 21, and read upwards to find the luminosity L of the kind of star that has that period.
So now you know the true luminosity L of the star. If you now measure its apparent brightness,
the flux F, you can get the distance to that luminous star as $D = (L/4\pi F)^{1/2}$.
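
The last step above, turning a known luminosity and a measured flux into a distance, can be sketched in a few lines of Python. This is only an illustration: the constants are assumed round values, and the Cepheid luminosity of 10^4 solar luminosities is a hypothetical figure, not one read from Fig. 21.

```python
import math

L_SUN_W = 3.828e26   # assumed solar luminosity in watts
PC_M = 3.0857e16     # assumed number of metres in one parsec


def flux_at_distance(luminosity_w, distance_pc):
    """Flux F = L / (4 pi D^2) in W/m^2 for a source at distance D."""
    distance_m = distance_pc * PC_M
    return luminosity_w / (4.0 * math.pi * distance_m ** 2)


def standard_candle_distance_pc(luminosity_w, flux_w_m2):
    """Distance D = (L / 4 pi F)^(1/2), returned in parsecs."""
    distance_m = math.sqrt(luminosity_w / (4.0 * math.pi * flux_w_m2))
    return distance_m / PC_M


# Hypothetical Cepheid with L = 10^4 Lsun (illustrative value only).
cepheid_luminosity = 1.0e4 * L_SUN_W

# Fake a "measurement" by placing the star at 780 kpc, then recover the distance.
measured_flux = flux_at_distance(cepheid_luminosity, 780.0e3)
recovered_kpc = standard_candle_distance_pc(cepheid_luminosity, measured_flux) / 1.0e3
print(f"recovered distance: {recovered_kpc:.1f} kpc")
```

Because D depends on the square root of L/F, a star that appears four times fainter than an identical one is twice as far away.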

Rather than using the plotted data in Fig. 21 empirically, it can be useful to have a mathematical
approximation to the relationship. This is what you worked out in Exercise 32.

In the 1920s, Hubble measured Cepheids in M31 and M33 and determined their distances,
which cracked open the extragalactic distance scale. The modern values are about 780kpc and
900kpc. With the Hubble Space Telescope, we can now spot individual Cepheid variables in a
few tens of nearby galaxies - not many, but enough to give the next steps in the distance scale a firm
foundation. Most of the galaxies with Cepheid distances are within about 5Mpc, but the most
distant galaxy so far with a Cepheid distance is NGC 3370, at a distance of 29Mpc.

5.6. Distant galaxies : the supernova method

At the end of their lives, some stars die gracefully and some explode catastrophically as supernovae[22].
Supernovae are rare - very roughly speaking, we expect about one supernova explosion per
galaxy every few hundred years, and indeed only a handful have been seen in the Milky Way in
recorded history. However, we see them go off in external galaxies quite often. If we watch any
single galaxy, we would have to wait a long time; but if we keep watching thousands of galaxies,
we will catch one quite often. For a month or two, a supernova can be almost as luminous as the
entire galaxy it happens in - so they are easy to spot when they happen. An example is shown in
Fig. 37. But are they standard candles - or are they all different? It turns out there are two basic
kinds of supernova.

Type II are “core collapse” supernovae. As explained in Chapter 3, these are massive stars
which finally run out of nuclear fuel, causing a collapse of the core which in turn generates a
huge amount of energy. They are horribly complicated, and so no good as standard candles.

Type I (in particular Type Ia) are “white dwarf bombs”. As explained in Chapter 3, because
this involves the conflagration of a White Dwarf exactly at its Chandrasekhar limit, the amount
of energy released is always the same, so they make excellent standard candles. The peak
luminosity of a Type Ia supernova is roughly 4 billion times that of the Sun. This means it can
be easily seen with large telescopes to distances of several hundred Mpc.

[22] Note that it’s one supernova, two supernovae - pronounced supernovee, more or less. Americans sometimes use
“supernovas” as the plural. This is ok, but “several supernova” is definitely wrong!

[Figure 37, right panel axes: log luminosity against rotation speed in km/s.]

Figure 37. Left : a before and after picture showing a supernova in the galaxy M100. (Credit :
NASA/SWIFT) Right : The Tully-Fisher relationship between luminosity and rotation speed (From
Brent Tully’s Scholarpedia page)

5.7. Distant galaxies : the Tully-Fisher galaxy rotation method

Spiral galaxies rotate. In Chapter 7 we will look more closely at galaxy dynamics, and learn how
we actually make velocity measurements. For the moment we will just assume we can measure
the rotation speed V . We can use this to estimate the mass of the galaxy, and use that in turn to
estimate the luminosity of the galaxy. Then, as ever, once we have the estimated luminosity and
the measured flux, we can deduce the distance of the galaxy. When we look at how this works,
it involves some guesses and approximations... so we end up having to calibrate the method
empirically.

A given galaxy has a size R and a rotation speed V, which is given as usual by $V^2 = GM/R$.
This gives us the mass of the galaxy, $M = RV^2/G$. The luminosity of the galaxy should be
proportional to its mass, but the constant of proportionality isn’t immediately obvious, because it
depends on what mixture of stars it has, and whatever non-stellar material there may be. Let’s just
say $L = \mu M$ for now. So we get

$$L = \mu R V^2 / G$$

However, there is a connection between L and R which enables us to get rid of R. It turns out
that the main reason galaxies have different luminosities is because they are different sizes - to
a first approximation, galaxies all have a pretty similar surface brightness, but some are bigger
than others, and so have a larger total luminosity. Again, that proportionality isn’t obvious a priori
- let’s just write $L = kBR^2$ where B is the average surface brightness, and k is some unknown
scaling constant. We can solve for R in terms of L, giving $R = (L/kB)^{1/2}$, and put that in the
equation above. After a little juggling we have

$$L = \frac{\mu^2}{G^2 k B} V^4$$

So on the one hand, this makes a clear and strong prediction - that luminosity scales as the 4th
power of rotation speed. On the other hand, we don’t know µ, k, and B. So what we do is to
try plotting L against V for galaxies for which we think we have some other estimate of distance.
This is shown in Fig. 37. You can see that the $V^4$ prediction works well, and furthermore,
because the scatter is not too bad, whatever the values of µ, k, and B are, it looks as though Nature has
been kind and they are always pretty much the same. Armed with this, if we measure V for some
new galaxy, we can read off the predicted L from the empirical graph.
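
The algebra behind the $V^4$ scaling is easy to check numerically. The sketch below builds toy galaxies of different sizes from the two starting relations and confirms that each one lands on the predicted $L \propto V^4$ line. The values of µ, k, B and G are arbitrary illustrative numbers in arbitrary units, not measured quantities, and the function names are our own.

```python
import math


def luminosity_from_size(radius, k, b):
    """Surface-brightness relation L = k B R^2."""
    return k * b * radius ** 2


def rotation_speed(radius, luminosity, mu, g):
    """V from V^2 = G M / R, with the mass taken as M = L / mu."""
    mass = luminosity / mu
    return math.sqrt(g * mass / radius)


def tully_fisher_luminosity(speed, mu, k, b, g):
    """Predicted L = mu^2 V^4 / (G^2 k B)."""
    return mu ** 2 * speed ** 4 / (g ** 2 * k * b)


def scaled_luminosity(speed, ref_speed, ref_luminosity):
    """Empirical use of the relation: scale from a calibrating galaxy."""
    return ref_luminosity * (speed / ref_speed) ** 4


# Arbitrary illustrative constants (not physical values).
MU, K, B, G = 0.5, 3.0, 2.0, 1.0

max_rel_err = 0.0
for radius in (1.0, 2.0, 5.0, 20.0):
    lum = luminosity_from_size(radius, K, B)
    speed = rotation_speed(radius, lum, MU, G)
    predicted = tully_fisher_luminosity(speed, MU, K, B, G)
    max_rel_err = max(max_rel_err, abs(predicted - lum) / lum)

print(f"largest relative mismatch: {max_rel_err:.2e}")
```

In practice only `scaled_luminosity` reflects how the method is used: the unknown constants are absorbed into a calibrating galaxy whose distance is known by some other method.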
Exercise 56. Think about it : What are the relative advantages and disadvantages of the
supernova method and the Tully-Fisher method?

5.8. The scale of the extragalactic universe

Just as ∼ 1pc is about the typical distance between stars inside a galaxy, it seems that ∼ 1Mpc,
or maybe half that, is a typical distance between large galaxies. (As we shall see later, small
galaxies are more common.) The Cepheid method gets us galaxy distances to 10 Mpc or
maybe a few tens of Mpc. The supernova and Tully-Fisher methods can give us estimates out to
hundreds of Mpc.

As you probably know, and we shall discuss properly later, it turns out that galaxies are systematically
rushing away from us, which shows us that the Universe as a whole is expanding. With our
supernova and T-F distances, we can calibrate the rate of apparent expansion velocity versus
distance. This gives us a final method to get very rough distances; if we measure the recession
velocity of a galaxy, we know how far away it must be. This enables us to estimate galaxy
distances on scales of many Gpc.

As far as we can tell so far, there are just more and more galaxies, the further we look. There
is no sign of some kind of “MetaGalaxy”. However, this does not mean that the galaxies are
distributed smoothly... this is an issue we will come back to in Chapter 9.
6: THE CONTENTS OF GALAXIES

6.1. Outline of content


• Stars of different sizes
• Star lifetimes and the age of the Milky Way
• Gas
• Dust
• Local mass census

What do galaxies contain? More than just stars! In this section we will look at the contents of
galaxies, and how we know what’s there. Much of what we know in detail comes from our studies
inside our own galaxy, the Milky Way, but what we see is typical of external galaxies too. Not all
galaxies have the same mix of contents, but the types of things that are present are all seen in
the Milky Way.

6.2. Stars of different sizes

Most obviously, galaxies contain many stars. However, we have seen in the “Stars” section of
the course how stars come in a range of masses and luminosities. Which is more common? The
big ones or the little ones? Estimating the mass of a random star is usually difficult, so let’s start
by looking at stellar luminosities.

6.2.1. The stellar luminosity function

What we would like to know is the stellar luminosity function φ(L) - the number of stars per unit
luminosity per unit volume of space. You might think that all we need to do is map out the sky,
estimate the luminosity L of each star we find in our survey, and count the observed luminosity
distribution N(L). However, the observed N(L) gives us the wrong answer. The green histogram
in Fig. 38 shows the observed N(L) for all 9110 stars you can see with the naked eye. Most
stars we see in the sky are more luminous than the Sun - typically 100 times more luminous.
The red histogram however shows the luminosities of the 356 known stars that are within 10pc
distance of the Sun. Nearly all of these stars are less luminous than the Sun - typically 100 times
less. Why do these two experiments give such different results?

[Figure 38 panels. Left, “Nearby stars vs Naked Eye stars”: normalised count against log(L/Lsun), for stars within 10pc and for naked eye stars. Right, “Stellar Luminosity function”: log(φ(L)) + const against log(L/Lsun), with a power-law line labelled α=-1.35 and a marker at L=630Lsun (M=5MSun).]

Figure 38. Left : The observed stellar luminosity distribution. The green (open) histogram shows
the observed distribution from the Yale Bright Star Catalog, essentially all naked eye stars. The
red (solid) histogram shows the distribution of all known stars within 10pc, from the Zakhozhaj 1979
catalog. Right : The field stellar luminosity function, after correction for the Vmax effect discussed
in the text. The line shown is $\phi(L) \propto L^{-\alpha}$ with α = 1.35. Data taken from the textbook by Binney and
Merrifield.

The Vmax effect. The reason for the difference between the red and green histograms in Fig. 38
is that more luminous stars can be seen much further away, so a catalogue of stars to a fixed
apparent brightness - for example the sensitivity limit of our eyes - is very misleading. In section
4.2.2 we showed that if we survey stars of luminosity L to a flux F the number we get is

$$N(L) = \phi(L) \times V_{\max}(L) \quad \text{where} \quad V_{\max}(L) = \frac{1}{6\pi^{1/2}} \left(\frac{L}{F}\right)^{3/2} \qquad (6.1)$$
Exercise 57. How does the maximum survey volume for a ten solar mass star compare to that
of a solar mass star? (Hint : recall how luminosity varies with mass.)

A survey to fixed depth F will overestimate the real number of luminous stars compared to
weaker ones by a considerable factor. There are two ways we can overcome this bias.

(i) We can try to find all the stars to a fixed distance, regardless of how faint they are. However,
this is very hard work, as the smallest stars are very faint, even when they are close. Nearly all of
the stars shown in the red histogram are invisible to the naked eye, and even with big telescopes,
many of them have only been discovered in modern times. Even today, the number at the lowest
luminosities has almost certainly been underestimated because there are small nearby stars we
still haven’t found.

(ii) We measure all the stars down to a fixed F , but then we use equation 6.1 to correct for the
bias. Consider stars of true luminosity L. They have a true number density in space φ(L), and
can be seen in our survey within a volume $V_{\max}(L)$. Therefore, the number we should count is

$$N(L) = \phi(L) \times V_{\max}(L) \propto L^{3/2} \phi(L)$$


From the observed N (L) we can therefore deduce the true φ(L). The result from classic star
surveys is shown in the right-hand side of Fig. 38. Over a large range, a reasonable fit is
$\phi(L) \propto L^{-\alpha}$ with α = 1.35. The steeper slope at high L is important, as we discuss in section
6.2.4. But first, how do we turn our luminosity function into a mass function?
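
The Vmax correction of method (ii) amounts to dividing each observed count by the volume surveyed for that luminosity. A minimal Python sketch is given below. The flux limit, the luminosity bins and the mock counts are all made-up illustrative values in arbitrary units; only the form of equation 6.1 and the slope α = 1.35 come from the text.

```python
import math


def v_max(luminosity, flux_limit):
    """Equation 6.1: survey volume for stars of luminosity L down to flux F."""
    return (luminosity / flux_limit) ** 1.5 / (6.0 * math.sqrt(math.pi))


def corrected_density(observed_counts, flux_limit):
    """Divide each observed count N(L) by Vmax(L) to estimate phi(L)."""
    return {lum: n / v_max(lum, flux_limit) for lum, n in observed_counts.items()}


FLUX_LIMIT = 1.0e-3   # hypothetical survey limit, arbitrary units
ALPHA = 1.35          # slope quoted in the text

# Build mock counts from an assumed true phi(L) = L^-alpha ...
luminosities = [0.01, 1.0, 100.0]
mock_counts = {lum: lum ** -ALPHA * v_max(lum, FLUX_LIMIT) for lum in luminosities}

# ... and check that the correction recovers the input density.
recovered = corrected_density(mock_counts, FLUX_LIMIT)
for lum in luminosities:
    print(f"L = {lum:6.2f}  N(L) = {mock_counts[lum]:.3e}  phi(L) = {recovered[lum]:.3e}")
```

Note that in the mock survey the counts N(L) rise with luminosity even though the true density φ(L) falls steeply, which is the same bias seen in the naked eye histogram.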

6.2.2. The stellar mass function

Above we have seen how to measure the luminosity function φ(L) - the number of stars per unit
volume per unit luminosity. We would also like to know the mass function n(M ) - the number of
stars per unit volume per unit mass. Because we already know how L varies with M - roughly
$L \propto M^4$ - changing from one function to the other is just a question of maths. However, changing
variables for density functions can be confusing, so let’s step through it carefully.

Let’s assume we are looking at unit volume in both cases, so we can just concentrate on the
per-unit-luminosity and per-unit-mass bits. Consider a small range of luminosity ∆L. Given that
the density per unit luminosity is φ(L), the number of stars we will find is N = φ(L)∆L. Next,
we ask, what range of mass ∆M does that range ∆L correspond to? Because we know L as a
function of M, that must be

$$\Delta M = \Delta L \cdot \frac{dM}{dL} = \frac{\Delta L}{dL/dM}$$

That range ∆M must give us the same observed number, i.e. N = n(M)∆M = φ(L)∆L. If you
substitute for ∆M from above, the ∆L cancels out and we get

$$n(M) = \phi(L) \cdot \frac{dL}{dM}$$

You can get to this a little quicker if you think in terms of differentials and just remember that
n(M )dM = φ(L)dL. It is a completely general formula for converting between density functions.
What does it tell us in our specific case?

The quantities we actually have - the luminosity function, the mass function, and the mass-luminosity
relationship - are all empirical, numerical, quantities. From theory, we might be able to predict
what their expected mathematical form should be, but this is hard. What we do in practice
is to use a mathematical shape that we think is a good approximation. The most common
assumption is to approximate all these functions as power laws. So for example, we assume
that the mass-luminosity relationship is $L \propto M^{\beta}$, and we find that β ∼ 4.0. The luminosity
function we found empirically to be increasing towards small luminosities, so we approximate
this as $\phi \propto L^{-\alpha}$, and find that α ∼ 1.35.
Exercise 58. Use the results above to show that $n(M) \propto M^{\gamma}$ with γ = β(1 − α) − 1.

With β = 4 and α = 1.35 we get the slope of the mass function to be γ = −2.4. This is pretty
close to the standard modern answer, γ = −2.35. It is quite a steep negative slope - low mass
stars are much more common than high mass stars.
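
As a numerical cross-check of the quoted slope (not a substitute for the derivation asked for in Exercise 58), the short Python sketch below evaluates n(M) = φ(L) dL/dM directly for the power-law approximations and measures its logarithmic slope. The two test masses are arbitrary choices and the normalisation is ignored.

```python
import math


def mass_function_slope(alpha, beta):
    """gamma = beta (1 - alpha) - 1 for phi ~ L^-alpha and L ~ M^beta."""
    return beta * (1.0 - alpha) - 1.0


def mass_function(mass, alpha, beta):
    """n(M) = phi(L) dL/dM with L = M^beta and phi = L^-alpha (unnormalised)."""
    luminosity = mass ** beta
    dl_dm = beta * mass ** (beta - 1.0)
    return luminosity ** -alpha * dl_dm


ALPHA, BETA = 1.35, 4.0
gamma_formula = mass_function_slope(ALPHA, BETA)

# Logarithmic slope measured numerically between two arbitrary masses.
m1, m2 = 0.5, 8.0
ratio = mass_function(m2, ALPHA, BETA) / mass_function(m1, ALPHA, BETA)
gamma_numeric = math.log(ratio) / math.log(m2 / m1)

print(f"gamma (formula) = {gamma_formula:.2f}, gamma (numeric) = {gamma_numeric:.2f}")
```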

6.2.3. Integrated Mass versus Integrated Light

If we take the power law function $n(M) \propto M^{-2.4}$ seriously, and carry on down towards zero
mass, there seem to be infinitely many dwarf stars, which is a little disturbing. However, there
is a lower limit to the mass of stars which can start nuclear burning - about $0.08\,M_\odot$. Below
this there are brown dwarfs, which are very hard to find and measure. Indications from modern
research so far are that even including brown dwarfs, the mass function peaks and turns over, so
we do after all have a finite number of very small star-like objects.

But what is the total mass? Is this finite? To see how to get this, consider a small mass range
dM centred on mass M. The number of stars per unit mass per unit volume is n(M) but each of
those stars has mass M so the overall mass contained in the range dM is M × n(M)dM, which
goes as $M^{\gamma+1} = M^{-1.4}$. Then if we integrate that function, we can find the total mass, i.e. below
some $M_*$ or above $M_*$, and find

$$\mathcal{M}(<M_*) \propto \left[M^{-0.4}\right]_0^{M_*} \quad \text{or} \quad \mathcal{M}(>M_*) \propto \left[M^{-0.4}\right]_{M_*}^{\infty}$$



The mass above M∗ converges. The mass below M∗ diverges, i.e. it would be infinite, but
luckily the mass function n(M ) is not really a power-law, as we discussed above. The low-mass
turnover in the brown-dwarf regime means that the total mass is finite. However, if we look at
the integrated mass above and below one solar mass, the mass in stars is dominated by low
mass stars - see the exercise below.
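
The convergence and divergence described above can be seen numerically. The Python sketch below integrates the pure power law $M^{-1.4}$ in arbitrary units, pushing the limits further and further out. The chosen limits are illustrative only, and the real mass function turns over at low mass, so the runaway growth is a property of the power-law approximation rather than of the Milky Way.

```python
def integrate_power_law(index, lower, upper):
    """Integral of x**index from lower to upper, valid for index != -1."""
    power = index + 1.0
    return (upper ** power - lower ** power) / power


MASS_INDEX = -1.4   # M * n(M) ~ M^(gamma + 1) with gamma = -2.4

# Mass in stars heavier than 1 solar mass: settles down as the upper limit grows.
above = [integrate_power_law(MASS_INDEX, 1.0, upper) for upper in (1e2, 1e4, 1e8)]

# Mass in stars lighter than 1 solar mass: keeps growing as the lower limit shrinks.
below = [integrate_power_law(MASS_INDEX, lower, 1.0) for lower in (1e-2, 1e-4, 1e-8)]

print("above one solar mass:", [f"{value:.3f}" for value in above])
print("below one solar mass:", [f"{value:.1f}" for value in below])
```

The same helper function can be reused with a different index and different limits when working through the exercise below.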

With light it is a different story. Although there are more low luminosity stars than high luminosity
stars, with $\phi(L) \propto L^{-1.35}$, the bigger ones emit more light each. For stars of a given luminosity L,
the amount of light emitted by such stars is $L \times \phi(L) \propto L^{-0.35}$. If we integrate this, it is convergent
towards low luminosities, but divergent towards high luminosities - so it would seem an infinite
amount of light is emitted by the biggest stars. However, the biggest known stars are roughly
$60\,M_\odot$ - above this, radiation pressure will blow a star apart. So in fact the total light above a
solar mass is finite.
Exercise 59. Consider stars in two ranges : from 0.1 to 1 solar masses, and from 1 to 10 solar
masses. Using our standard approximations, calculate the ratio of (a) the integrated masses in
each range, and (b) the ratio of the integrated luminosities in each range. (Hint : be careful what
you are integrating...)

As we see from the above exercise, we have the curious situation that for the Milky Way most of
the mass is in small stars, whereas most of the light is coming from big stars.

6.2.4. The lifetime of stars and the age of the Milky Way

As well as having differing masses, stars will have been formed at different times, so as we see
them today will have differing ages. But as we saw in the “Stars” section of the course, stars
evolve and change their properties. They spend a long time on the main sequence, and then
they evolve - for example high mass stars end up exploding as supernovae, and stars like the
Sun swell up as red giants, and end up leaving only a white dwarf behind. We also saw that
massive stars go through their evolution quickly, whereas low-mass stars last much longer. This
means that the mass function we see today is not the same as the mass function with which the
stars formed - the older high mass stars have disappeared, whereas the older low-mass stars
are still here.

We can see this effect if we look at the stars in a young cluster like the Pleiades (see for example
Fig. 36). The number of stars versus luminosity seems to follow $L^{-1.35}$. This is consistent with
the result we saw by making a survey of field stars. However, in the field stars, that slope only
applies up to about $5\,M_\odot$; above that, the luminosity function is steeper, whereas in the Pleiades
the same slope continues. Now, we don’t really understand what causes that special 1.35 slope,
but we do think that all stars originally formed in clusters like the Pleiades, but have gradually
wandered away from their birthplaces. If the general field stars show a 1.35 slope at low masses,
but not at high masses, it must be because at least some of the original high mass stars have
disappeared. From this fact, we can learn about the history of the Milky Way. But first we need
to understand stellar lifetimes.

Solar lifetime. Suppose a fraction f of the Sun’s mass can be burned by nuclear fusion, and
that fusion turns mass to energy $mc^2$ with efficiency e. Then the total amount of energy the Sun
can make is $E_\odot = e f M_\odot c^2$. This is being released from the Sun at a rate $L_\odot$; so the nuclear
burning lifetime of the Sun is $t_\odot = E_\odot / L_\odot$. Because the Sun is dominated by Hydrogen to Helium
burning, we know e = 0.007, and we believe that the nuclear burning core of the Sun is 10% of
its mass, so f ∼ 0.1.
Exercise 60. Show that the lifetime of the Sun should be roughly 10.4Gyr.

More careful analysis gives a solar lifetime of 10.8Gyr. The age of the oldest rocks on Earth is
roughly 4.5Gyr. If the Earth formed very soon after the Sun, this would seem to suggest that the
Sun is halfway through its life.
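
The simple lifetime estimate above is a one-line calculation, sketched here in Python so that the units can be followed. The solar mass, solar luminosity, speed of light and seconds-per-Gyr figures are assumed standard values rather than numbers taken from this chapter, so the last digit of the answer depends on them.

```python
M_SUN_KG = 1.989e30        # assumed solar mass in kg
L_SUN_W = 3.828e26         # assumed solar luminosity in W
C_LIGHT = 2.998e8          # assumed speed of light in m/s
SECONDS_PER_GYR = 3.156e16  # 10^9 years of about 3.156e7 s each


def nuclear_lifetime_gyr(mass_kg, luminosity_w, efficiency=0.007, burn_fraction=0.1):
    """Lifetime t = e f M c^2 / L, converted from seconds to Gyr."""
    energy_j = efficiency * burn_fraction * mass_kg * C_LIGHT ** 2
    return energy_j / luminosity_w / SECONDS_PER_GYR


solar_lifetime = nuclear_lifetime_gyr(M_SUN_KG, L_SUN_W)
print(f"simple solar lifetime estimate: {solar_lifetime:.1f} Gyr")
```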

Stellar lifetime vs mass. As we have seen, $L \propto M^4$. All else being equal, the nuclear lifetime
would then vary as $t_* = E/L \propto M^{-3}$. In fact, higher mass stars are better at burning their mass,
so that the burnable fraction f varies along the mass range, with f ∝ M. The net effect is that

$$t_* = 10.8\,(M_*/M_\odot)^{-2}\ \text{Gyr} \qquad (6.2)$$

So a $0.1\,M_\odot$ M-dwarf should last a trillion years, whereas a $50\,M_\odot$ blue supergiant will last only a
few million years.
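
Equation 6.2 is simple enough to tabulate directly. The following Python sketch evaluates it for a few masses; the function name and the particular list of masses are our own choices for illustration.

```python
SOLAR_LIFETIME_GYR = 10.8   # value quoted in the text


def stellar_lifetime_gyr(mass_solar):
    """Equation 6.2: t = 10.8 (M / Msun)^-2 Gyr."""
    return SOLAR_LIFETIME_GYR * mass_solar ** -2.0


for mass in (0.1, 1.0, 5.0, 50.0):
    print(f"M = {mass:5.1f} Msun  ->  t = {stellar_lifetime_gyr(mass):.4g} Gyr")
```

The 0.1 solar mass case comes out at about 1080 Gyr, i.e. roughly a trillion years, and the 50 solar mass case at about 0.004 Gyr, i.e. a few million years, in line with the statement above.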

How old is the Milky Way? Suppose the age of the Milky Way is T and the lifetime of some
star of mass M∗ is t∗ . Next, suppose that the Milky Way formed all its stars in one go. Then for
any star with t∗ < T , there should be none left - for example if the Milky Way is 100Gyr old there
should be no stars like the Sun at all. This doesn’t seem right - we don’t see a cliff-like drop off
in the luminosity function, but a more gradual turn down.

Suppose instead that the Milky Way has been continuously making new stars in a steady trickle
since its formation. For any stars with long lifetimes, t∗ > T , they have been gradually accumulating
over the life of the Milky Way, i.e. over the whole time T . However for stars with t∗ < T , only
those formed during the last t∗ years will still be here - older ones have disappeared. The fraction
of all the M∗ stars ever made that are still here will be t∗ /T .

Examining Fig. 38, we can see that stars with $\sim 5\,M_\odot$, which have $L \sim 630\,L_\odot$, are below the
extrapolation of our α = 1.35 curve by about a factor ten. A reasonable guess is that the α = 1.35
slope is intrinsic - lower luminosity stars are more common - whereas the extra dip is due to the
finite-age drop-off. Our $5\,M_\odot$ stars have a lifetime of ∼ 0.4Gyr; if only 10% of them are still here,
that suggests that the age of the Milky Way is ten times as long - about T ∼ 4Gyr.
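
The age argument can be written out as a short calculation. The Python sketch below assumes, as the text does, a constant star formation rate, so that the surviving fraction of short-lived stars is the ratio of their lifetime to the age of the Milky Way. The function names are our own, and the factor-ten deficit is the rough value read from Fig. 38.

```python
def surviving_fraction(lifetime_gyr, age_gyr):
    """Fraction of stars ever formed that are still alive, for constant star formation."""
    return min(1.0, lifetime_gyr / age_gyr)


def age_from_deficit(lifetime_gyr, fraction_remaining):
    """Invert (lifetime / age) = fraction to get the age T."""
    return lifetime_gyr / fraction_remaining


lifetime_5msun = 10.8 * 5.0 ** -2.0            # equation 6.2, about 0.43 Gyr
age = age_from_deficit(lifetime_5msun, 0.1)    # factor-ten deficit read from Fig. 38
print(f"lifetime = {lifetime_5msun:.2f} Gyr, implied age = {age:.1f} Gyr")
```

Long-lived stars, with a lifetime greater than the age, have a surviving fraction of one, which is why the low-mass end of the luminosity function is unaffected.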

Our simple analysis has given an age for the Milky Way which is quite similar to the age of the
solar system. A more careful analysis gives an answer of ∼ 8-9 Gyr - this is still an active
research topic. Note that what we have is an estimate of the age of the Milky Way disc. As we
shall see later, some parts of the Milky Way are older, and the Universe as a whole is older still -
but not by much.
Exercise 61. A more challenging question... In the above calculations, we assumed that star
formation has been constant since the formation of the Milky Way. For T = 4 Gyr and $5\,M_\odot$
stars with lifetime t∗ = 0.4Gyr, this means that 10% of all such $5\,M_\odot$ stars formed are still here.
Suppose instead we assume that star formation has declined exponentially with a timescale of
$t_e = 3$ Gyr. What percentage of all the $5\,M_\odot$ stars ever made would still be here? (Hint : define
the star formation rate as a function of time S(t) and integrate).

The space between the stars in our Galaxy is not empty - it contains gas, small solid particles
known as dust, and relativistic plasma, collectively known as the interstellar medium (ISM). In
this section we will discuss the gas. This has multiple components or phases - the ISM is made
of interlocking bubbles and strands of material of different densities and temperatures.

6.2.5. Neutral Hydrogen

The majority of the gas in the Milky Way and in most but not all external galaxies is in the form
of simple neutral Hydrogen. It is very cold, typically in the range 20-50K. It is optically invisible
but easy to detect in the radio, because it emits a radio frequency spectral line at λ = 21cm.
How does this happen? In Section 1.3 we looked at the energy levels of Hydrogen. The ground
state, n = 1, is actually split into two very close “hyperfine” states with $\Delta E = 5.8 \times 10^{-6}$ eV. This
is due to the spins of the proton and the electron. These spinning charges are effectively tiny
magnets; like all magnets their tendency is to align with N-pole opposite S-pole. To force one of the
magnets to be parallel, i.e. with its magnetic axis pointing the same way as the other
magnet, you would need to put work in. So for the proton and electron the parallel state has
slightly higher energy than the anti-parallel state. Like everything else in the atomic world, this
energy difference is quantised - there are simply two distinct energy states.

Collisions with other atoms frequently bump up the Hydrogen atom into the upper energy state,
but it quite soon drops back down and emits a photon. The wavelength of the photon is given by
λ = hc/∆E = 21.37cm[23]. This is a very macroscopic value - it’s a radio wave at a very specific
frequency, which is easy to recognise and map out with our radio telescopes.
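
The wavelength quoted above can be checked with a few lines of Python. The main pitfall is units: the energy splitting is given in eV and has to be converted to joules before dividing into hc. The constants below are assumed rounded values, so the result agrees with the figure in the text only to about three significant figures.

```python
H_PLANCK = 6.626e-34   # Planck constant in J s (assumed rounded value)
C_LIGHT = 2.998e8      # speed of light in m/s (assumed rounded value)
EV_TO_J = 1.602e-19    # joules per electron-volt (assumed rounded value)


def photon_wavelength_m(energy_ev):
    """Wavelength lambda = h c / E, with E converted from eV to joules first."""
    return H_PLANCK * C_LIGHT / (energy_ev * EV_TO_J)


wavelength_m = photon_wavelength_m(5.8e-6)
frequency_hz = C_LIGHT / wavelength_m
print(f"wavelength = {wavelength_m * 100:.1f} cm, frequency = {frequency_hz / 1e9:.2f} GHz")
```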
Exercise 62. In a gas at temperature T the typical particle energy is E = 3kT /2. How hot does
a gas have to be for collisions which excite the Hydrogen ground state to be common?

As we see from the exercise above, it is easy to excite the hyperfine electron state - collisions
with other atoms make this happen all the time. On the other hand, almost as soon as the
state is excited, it de-excites by radiation. The result is an equilibrium determined just by the
laws of quantum mechanics that define the hyperfine state, not by local conditions such as the
temperature or pressure or gas density; so the fraction of Hydrogen atoms in the excited state
is always about the same, and the rate of photon emission per atom is also always pretty much
the same. This is very useful. It means that the total 21cm line luminosity from a region is just
proportional to the number of H atoms it contains - so we can get an estimate of the gas mass.

[23] Check this, but watch out for units!

6.2.6. Molecular gas

Hydrogen atoms have a tendency to stick together in the molecular form H$_2$. Starlight can
easily split apart the molecular bond, so in most of the ISM the Hydrogen is in the atomic form.
However in cold dense regions the starlight can be blocked out and molecules form - not just
H$_2$ but more complex molecules such as CO, HCN, H$_2$O, C$_2$H$_5$OH, and so on. How do we
know these molecules are there? It’s because we see radio frequency and far-IR emission lines
corresponding to changes in energy state. In this case, the energy concerned is to do with the
rotation and vibration of the molecules. As usual, these energy states are quantised; they can be
excited by collisions; and then they decay giving emission lines at radio and far-IR frequencies.
The result is an extraordinarily rich radio frequency spectrum.

Molecular Hydrogen, although it is most of the mass of molecular gas, is quite hard to detect
because it has few strong transitions. Many of the complex molecules, and especially CO, are
much easier to detect, and act as tracers of material we otherwise might not know about. It
is also intrinsically interesting that such a rich chemistry occurs in dark clouds. Some people
speculate this may be a necessary stage in building the materials necessary for life.

6.2.7. Ionised gas

Although most gas in galaxies is cold and optically invisible, some gas can be seen in visible
light pictures. Such features are historically known as “nebulae”[24] which is Latin for “clouds”.
Some nebulae are connected with the late stages of stellar evolution - supernova remnants, and
planetary nebulae, which are nothing to do with planets, but are caused by the expansion of the
outer parts of stars. However, many nebulae are connected with regions of new star formation
- the classic example is the Orion nebula. The optical spectra of such regions show emission
lines corresponding to transitions between the standard atomic energy levels. (An example is
shown in Fig. 39). This occurs because the gas is ionised by the light from nearby stars; every
so often an ion (usually just a Hydrogen nucleus, i.e. a proton, but sometimes a heavier element)
captures an electron to make a neutral but excited atom, which then cascades down the energy
levels.

[24] Another historic plural form. Like with supernova/supernovae, we have one nebula and many nebulae.

Figure 39. Optical spectrum of the Eta Carinae nebula. The Hα line corresponds to Hydrogen
changing between n = 3 and n = 2, and Hβ between n = 4 and n = 2 etc. The lines labelled [OIII]
correspond to transitions in doubly ionised Oxygen.

This trick, of ionising gas, can only be done by hot stars. As we learned in Section 1.3, the
ionisation energy for Hydrogen is 13.6 eV $= 2.18 \times 10^{-18}$ J. A photon with this much energy has
wavelength λ = hc/E = 91nm - in the UV. What sort of star would peak at this wavelength?
Equation 1.45 tells us that a blackbody at temperature T peaks at a wavelength given by

$$\lambda T = 2.9 \times 10^{-3}\ \text{m K}$$

so a peak wavelength of 91nm suggests a temperature T ∼ 32000K. Only the most massive
stars are this hot - stars of around $50\,M_\odot$. Stars of somewhat lower mass still have significant
UV, but rapidly less so as you go down in mass, so ionisation is dominated by stars over about
$25\,M_\odot$. However, as we saw in section 6.2.4, such massive stars have very short lifetimes - only
a few million years. So the ionised gas, as well as the blue light from stars, picks out regions of
very recent star formation.
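
The two numbers in this argument, the 91nm ionisation wavelength and the corresponding blackbody temperature, can be reproduced with the short Python sketch below. The physical constants are assumed rounded values; the Wien constant is the one quoted in the text.

```python
H_PLANCK = 6.626e-34   # Planck constant in J s (assumed rounded value)
C_LIGHT = 2.998e8      # speed of light in m/s (assumed rounded value)
EV_TO_J = 1.602e-19    # joules per electron-volt (assumed rounded value)
WIEN_B = 2.9e-3        # Wien displacement constant in m K, as quoted in the text


def photon_wavelength_m(energy_ev):
    """Wavelength lambda = h c / E for a photon of energy E in eV."""
    return H_PLANCK * C_LIGHT / (energy_ev * EV_TO_J)


def wien_temperature_k(peak_wavelength_m):
    """Blackbody temperature whose emission peaks at the given wavelength."""
    return WIEN_B / peak_wavelength_m


ionising_wavelength = photon_wavelength_m(13.6)
star_temperature = wien_temperature_k(ionising_wavelength)
print(f"lambda = {ionising_wavelength * 1e9:.0f} nm, T = {star_temperature:.3g} K")
```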
Exercise 63. Roughly speaking, the lowest surface temperature a star can have and give a
reasonable amount of ionisation is T ∼ 10,000K. This is about the temperature of the bright
star Vega (Alpha Lyrae), a main sequence star with T = 9602K. Vega has mass $2.135\,M_\odot$ and
luminosity $40.12\,L_\odot$. If Vega can burn 10% of its mass, estimate its lifetime, and thus the oldest a
nearby ionised gas region can be.

6.3. Dust

We met the idea of dust extinction previously in the Milky Way section - very fine solid particles
that partially block out light, causing the patchy appearance of the Milky Way. The densest clouds
can cause patches of almost complete blackness, as shown on the left of Fig. 40. However, we
can also see dust glowing in emission. The energy of starlight that dust particles are absorbing
in the visible light regime has to come out somewhere else. The absorbed light heats the dust
particles and they then radiate as blackbodies (they are not quite perfect blackbodies, but never
mind). We can see this warm dust emission spread throughout the Milky Way. The right hand
side of Fig. 40 is a map of part of the Milky Way made from far-IR observations. Why is the dust
particularly visible when viewing at far-IR wavelengths? To understand this, we need to think
about how material between the stars is heated, and how it radiates.

Figure 40. Dust in absorption and emission. Left : visible light image of Barnards Nebula, showing
a very dark patch caused by dust extinction in a dense cloud. (Credit : FORS team/ESO). Right :
Far-IR map of the Galactic Centre region, made by the IRAS Satellite. Here everything we see is
dust emission. (Credit : NASA)

There are two different ways to approach this calculation, depending on what we assume about
the object being heated. First we will assume it is a thick flat slab, of area A, at a distance D
from a star of luminosity L. (See left hand side of Fig. 41.) At the position of the slab, the flux
(power per square metre) is $F = L/4\pi D^2$. The power being absorbed by the slab is therefore

$$P_{\rm abs} = A \cdot \frac{L}{4\pi D^2}$$

We assume that this energy heats up a thin surface layer. When the surface gets to temperature
T , it will radiate as a blackbody, emitting power

$$P_{\rm emit} = A \sigma T^4$$

(You may recall that a spherical surface emits luminosity $L = 4\pi R^2 \sigma T^4$; the spherical surface
has an area $A = 4\pi R^2$). The surface will keep warming up until $P_{\rm abs} = P_{\rm emit}$. The area A then
cancels out, and we find that the equilibrium temperature will be

$$T_{\rm eq} = \left(\frac{L}{4\pi D^2 \sigma}\right)^{1/4}$$

If we take a random spot between stars in the ISM, the dust is very cold, around 10K. However the
dust near luminous stars can be much warmer, and also therefore much brighter. If for example
we take a massive star with $L \sim 10^5\,L_\odot$ then at a distance D = 0.1pc we get T = 49K. Using
$\lambda T = 2.9 \times 10^{-3}$ m K we see that such dust will radiate at λ = 60µm, in the far-IR. The emission from
dust near those luminous stars will usually outshine the colder dust. Far-IR emission therefore,
like blue light and ionised gas, is a tracer of recent star formation in galaxies. Overall, in galaxies
with star forming discs, like the Milky Way, about 10% of the light gets hoovered up by dust and
re-emitted in the IR. There are also rare objects known as starburst galaxies which emit most of
their light in the IR. We will meet these again later.
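
The numbers quoted above for dust near a massive star can be checked with a few lines of Python. This is only a sketch of the slab formula and the λT relation from the text; the values of the physical constants are standard ones assumed here, and the function names are made up for illustration.

```python
import math

# Standard constant values, assumed here for illustration
SIGMA_SB = 5.670e-8   # Stefan-Boltzmann constant, W m^-2 K^-4
L_SUN = 3.83e26       # solar luminosity, W
PARSEC = 3.086e16     # one parsec in metres
WIEN_B = 2.9e-3       # lambda * T for the blackbody peak, m K (as in the text)


def slab_equilibrium_temperature(luminosity_w, distance_m):
    """T_eq = (L / (4 pi D^2 sigma))^(1/4) for a flat slab facing the star."""
    flux = luminosity_w / (4.0 * math.pi * distance_m ** 2)
    return (flux / SIGMA_SB) ** 0.25


def wien_peak_wavelength(temperature_k):
    """Wavelength (in metres) at which a blackbody of this temperature peaks."""
    return WIEN_B / temperature_k


# The example from the text: L = 1e5 L_sun, D = 0.1 pc
t_eq = slab_equilibrium_temperature(1e5 * L_SUN, 0.1 * PARSEC)
peak_um = wien_peak_wavelength(t_eq) * 1e6
print(f"T_eq = {t_eq:.1f} K, peak wavelength = {peak_um:.1f} micron")
```

The result should land close to the 49 K and 60 µm quoted above. Changing the distance shows how quickly the dust cools as you move away from the star, since T falls only as the inverse square root of D.
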
Exercise 64. Above we assumed the heating effect is on a flat slab facing the heating star.
Suppose instead we assume that the dust grain is a small sphere, so that the heating effect
is spread through the body, and the whole surface of the grain radiates at the warmed-up

[Figure 41 panel labels: (a) slab model, showing a slab of area A and a star of luminosity L; (b) sphere model]

Figure 41. Two models of how interstellar material is heated. On the left, we assume a thick flat
slab of material, with just the surface being heated. The same area radiates the energy back into
space. On the right, we assume a small spherical dust grain. The absorbed energy is spread
through the body of the grain, and is radiated back into space by the whole surface area.

temperature. (See right hand side of Fig. 41.) With these assumptions, derive a revised version
of the equilibrium temperature formula. (Be careful about effective absorbing area, and the
radiating area). Can you think of another important effect we have missed out?

What are these dust grains? They seem to be a mixture of carbon-based grains (e.g. graphite)
and Silicates (e.g. Silicon dioxide) - i.e. pretty much like soot and sand. In some ways, “smoke”
would be a better name than “dust”. Here are some size comparisons :

• Atoms : 0.1nm
• Proteins : 30nm
• Cosmic dust : 1µm
• Smoke : 1µm
• Silt : 10µm
• Car Exhaust : 30µm
• Fine sand : 100µm
• Coarse sand : 1mm

You can see that the size of cosmic dust grains is similar to the wavelength of visible light, which
is why extinction is so sensitive to wavelength - much larger waves sail past and hardly notice
the dust; much smaller waves are simply blocked.

Exercise 65. Think about it: sometimes we see blue light coming from clouds in dusty regions.
Can you guess why this might be?

6.4. Local Mass Census

How much material is there in the Milky Way, and is it dominated by stars, or gas, or what? The
answer for the Milky Way, and for other galaxies, depends on your location within the galaxy. In
the solar neighbourhood (a few tens of parsecs around the Sun) we find that the total matter
content is roughly $0.1\,M_\odot\,{\rm pc}^{-3}$. The breakdown by type is :

• Main Sequence stars : 31%


• Giant (evolved) stars : 6%
• White dwarfs and Brown dwarfs : 13%
• Neutral gas : 29%
• Molecular gas : 21%

It seems that around us about half the matter is in stars or dead stars, and half in gas. The gas
is the fuel for making new stars, so the Milky Way can keep going for some time yet.

You may be interested to note that there is no mention above of the notorious “dark matter”. We
shall see in the next chapter that dark matter is very important for the Milky Way overall, but in
the vicinity of the Sun its density is negligible.
7: THE STRUCTURE AND DYNAMICS OF GALAXIES

7.1. Outline of content


• Galaxy Types and Structures
• Quantitative Morphology
• Disc rotation and masses
• Stellar motions and interactions in spheroids
• Galaxy collisions
• Galaxy components summary

7.2. Galaxy Types and Structures

7.2.1. Morphological components of galaxies

In the previous chapter, we saw how galaxies contain gas and dust as well as stars. As well as
these physical components, galaxies contain characteristic structural shapes, or morphological
components. These are illustrated schematically in Fig. 42, and you can see real examples in
Figs. 34 and 43, and also in the map of our own Milky Way in Fig. 31. First, we see a thin flat
disc, which also contains spiral arms. The spiral arms are of course much easier to see when
we happen to see a galaxy face-on. Towards the middle of the galaxy we see a central bulge.
This is much easier to see when we happen to see the galaxy edge-on. (It has been said that
the Milky Way is like two fried eggs back-to-back). The final component is less obvious to the
eye - a spheroidal halo which extends much further than the bulge, and even further than the
disc. The globular clusters are part of this system, spread around a galaxy in all directions.

Not all galaxies have all of these morphological components, and some don’t have any of them
- NGC 1427A, which you can see in Fig. 43, is just an irregular mess! Likewise, not all galaxies
have all the standard physical components. Is there a pattern to the manner in which components
go together? The answer is yes there is a pattern, and it gives us big clues to how galaxies were
formed.

Figure 42. Schematic illustration of typical galaxy components. On the left is an edge-on view,
showing the flat disc, central bulge, and spheroidal halo, which contains the globular clusters. On
the right is a face-on view showing disc, bulge, and spiral arms.

7.2.2. Galaxy types

When we look at which components go together, we find that there are three basic types of
galaxy - spiral, elliptical, and irregular. The pattern is explained in the following table.

Spirals                             Ellipticals                   Irregulars
----------------------------------  ----------------------------  ---------------------
flat disc with spiral arms          no disc                       no disc
central bulge and spheroidal halo   single spheroidal component   sometimes bulge/halo
blue and red stars                  red stars only                blue and red stars
lots of gas                         no gas                        lots of gas

The relative size of bulge and disc varies quite a lot in spirals, but both components are always
there. The difference in star colours suggests that ellipticals stopped making stars a long time
ago (the massive blue stars have all disappeared), whereas spirals are continuing to make stars.
The cold gas that we see in spirals is the material that is needed for making new stars. Ellipticals
don’t have any gas, so they can’t make new stars. The ionised gas, and the bluest light, trace out
where the latest star formation is going on, and it is fairly clear that this is concentrated along the
spiral arms. By contrast, the bulge and halo regions of spirals are relatively red, and gas free.
Very roughly speaking, it looks as if spiral galaxies are essentially gas discs sitting in and around
ellipticals. The irregulars, like the spirals, contain lots of gas, and blue stars. They are actively
forming stars, but the gas structure is not in a nice simple thin disc. It turns out there are two
reasons for this lack of a disc. Sometimes it is just because the gas has not “settled down” into a
disc. Sometimes it is because the disc has been disrupted by collision with another galaxy. We
will look at these colliding galaxies more carefully later in this chapter.

Exercise 66. The colours of a galaxy imply that it contains no stars more massive than $1.5\,M_\odot$. When did
it stop making stars? (Hint: review the material about stellar lifetimes from the previous chapter.)

[Figure 43 panel labels. Top row: M83 (VLT), M60 (HST). Middle row: NGC1427A (HST), M31 with M32 (HST). Bottom row: NGC5907 (SDSS), M104 (HST).]

Figure 43. Examples of the basic galaxy types. Top left is a spiral galaxy, and top right an elliptical.
Middle left is a dwarf irregular, and middle right is a dwarf elliptical, sat next to its massive spiral
neighbour. The lowest row contrasts two edge-on spirals, one of which has a very small bulge and
one of which has a very large bulge. “VLT” refers to the Very Large Telescope (Credit: ESO); “HST”
refers to the Hubble Space Telescope (Credit: NASA/ESA); “SDSS” refers to the Sloan
Digital Sky Survey (Credit: SDSS Consortium)

7.3. Quantitative Morphology

Pictures are nice, but as scientists we like numbers. Furthermore, the structures we have
discussed are not neat sharp-edged shapes, but rather fuzzy things. Is there a more quantitative
way we can study morphology? There is, and when we do so we also learn extra intriguing
things.

The key is to measure the surface brightness of a galaxy, and to see how this changes with radial
distance from its centre. The idea is illustrated in Fig. 44.

Figure 44. Measuring the light profiles of galaxies. The idea is to measure the amount of light
coming from each small patch of the galaxy, and to see how this changes with radial distance R
from the centre of the galaxy, as indicated on the left. What we actually measure is the angular
distance θ on the sky from the centre of the galaxy, as indicated on the right. If we know the
distance D we can convert θ to physical distance R.
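
The conversion from angle to physical distance mentioned in the caption is just the small-angle relation R = Dθ with θ in radians. A minimal sketch follows; the function name and the example numbers are invented for illustration and are not taken from any real galaxy.

```python
import math

ARCSEC_IN_RAD = math.pi / (180.0 * 3600.0)  # one arcsecond in radians


def physical_radius_kpc(theta_arcsec, distance_mpc):
    """Small-angle conversion R = D * theta, returning kiloparsecs."""
    theta_rad = theta_arcsec * ARCSEC_IN_RAD
    return distance_mpc * 1000.0 * theta_rad  # Mpc -> kpc


# Hypothetical example: a patch 60 arcsec from the centre of a galaxy 10 Mpc away
print(f"R = {physical_radius_kpc(60.0, 10.0):.2f} kpc")
```
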

7.3.1. Light profiles of spiral discs

The light profiles of spiral discs turn out to be surprisingly simple. They have spiral structure
and clumpy patches of course, but averaging over this, we always find that the brightness I as a
function of separation from the centre R is given by:

(7.1)  $I(R) = I_0\, e^{-R/R_d}$

Here $I_0$ is the central surface brightness, and $R_d$ is a characteristic disc scale length. Furthermore,
it turns out that nearly all spiral discs have pretty much the same value of $I_0$, but different values
of $R_d$. How much light is there in total? First consider the flux within an annular ring of width dR:

    $F(R)\,dR = 2\pi R\, I(R)\, dR$

What we want is then $F_{\rm tot} = \int_0^\infty F(R)\,dR$. To solve this, we note the standard integral [25]
$\int_0^\infty x^n e^{-x}\,dx = n!$. If you make the substitution $x = R/R_d$ you can match to that integral with
n = 1 and you get the result

(7.2)  $F_{\rm tot} = 2\pi R_d^2 I_0$

Given that I0 is always pretty much the same, we see that a brighter galaxy is simply a bigger
galaxy.
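
If you would rather not do the integral by hand, you can check eqn 7.2 numerically by adding up the light in many thin rings. The sketch below uses a simple midpoint sum; the values of I0 and Rd are arbitrary illustrative numbers, and the outer cut-off at 40 scale lengths is an assumption that is safe because the profile has fallen to essentially nothing by then.

```python
import math


def disc_profile(r, i0, rd):
    """Exponential disc surface brightness, I(R) = I0 * exp(-R / Rd)."""
    return i0 * math.exp(-r / rd)


def total_flux_numeric(i0, rd, r_max_in_rd=40.0, steps=200_000):
    """Add up the ring fluxes 2*pi*R*I(R)*dR using the midpoint rule."""
    dr = r_max_in_rd * rd / steps
    total = 0.0
    for k in range(steps):
        r = (k + 0.5) * dr  # middle of the k-th ring
        total += 2.0 * math.pi * r * disc_profile(r, i0, rd) * dr
    return total


i0, rd = 1.0, 3.0                         # arbitrary illustrative units
numeric = total_flux_numeric(i0, rd)
analytic = 2.0 * math.pi * rd ** 2 * i0   # eqn 7.2
print(f"numeric = {numeric:.6f}, analytic = {analytic:.6f}")
```
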

[25] Those who know enough calculus can try solving the integral. It’s done using integration by parts. In the solution,
n! means n(n − 1)(n − 2)...(2)(1) and is pronounced “n factorial”.

7.3.2. Light profiles of ellipticals and bulges

Ellipticals are observed to have an exponential light profile, but rather than being exponential in
R like spirals, they are found to be exponential in $R^{1/4}$ :

(7.3)  $I(R) = I_0\, e^{-(R/R_e)^{1/4}}$

There is no special deep reasoning behind this formula - it is just empirically found to give a
good fit. The challenge to theories of galaxy formation is then to make models which produce
this profile. Note that this profile can be integrated in a similar (only slightly more complicated)
fashion to give $F_{\rm tot} = 8!\,\pi I_0 R_e^2$
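
For those who want to see where the 8! comes from, here is one way the integration can be sketched. It uses the same ring argument and the same standard integral as for the disc, this time with n = 7.

```latex
\begin{align*}
F_{\rm tot} &= \int_0^\infty 2\pi R\, I_0\, e^{-(R/R_e)^{1/4}}\, dR \\
\intertext{Substituting $x = (R/R_e)^{1/4}$, so that $R = R_e x^4$ and $dR = 4 R_e x^3\, dx$:}
F_{\rm tot} &= 2\pi I_0 \int_0^\infty (R_e x^4)\, e^{-x}\, (4 R_e x^3)\, dx
             = 8\pi I_0 R_e^2 \int_0^\infty x^7 e^{-x}\, dx \\
            &= 8\pi I_0 R_e^2 \times 7! = 8!\,\pi I_0 R_e^2
\end{align*}
```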

Bulges also show this “$R^{1/4}$” law. Real spiral galaxies show a mixture of bulge and disc light. By
fitting these functions, you can objectively measure the ratio of the bulge component to the disc
component in any one galaxy. Two examples are shown in Fig. 45.
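
Once the two profiles have been fitted, the bulge-to-disc ratio follows from the two total-light formulae. The sketch below assumes the fit has already been done and simply plugs in hypothetical fitted parameters (in arbitrary units); it is not the procedure used in the papers cited for Fig. 45.

```python
import math


def disc_total_flux(i0_disc, r_d):
    """Total disc light from eqn 7.2: F = 2 * pi * Rd^2 * I0."""
    return 2.0 * math.pi * r_d ** 2 * i0_disc


def bulge_total_flux(i0_bulge, r_e):
    """Total bulge light from the R^(1/4) law: F = 8! * pi * I0 * Re^2."""
    return math.factorial(8) * math.pi * i0_bulge * r_e ** 2


def combined_profile(r, i0_disc, r_d, i0_bulge, r_e):
    """Surface brightness of disc plus bulge at radius r."""
    disc = i0_disc * math.exp(-r / r_d)
    bulge = i0_bulge * math.exp(-((r / r_e) ** 0.25))
    return disc + bulge


# Hypothetical fitted parameters, arbitrary units
i0_disc, r_d = 1.0, 3.0
i0_bulge, r_e = 50.0, 0.002

ratio = bulge_total_flux(i0_bulge, r_e) / disc_total_flux(i0_disc, r_d)
print(f"bulge-to-disc light ratio = {ratio:.2f}")
```

Notice that with the law written in this form the bulge scale length Re comes out very small compared with Rd, because of the large factor 8! in the total light.
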

[Figure 45: two panels, each plotting log surface brightness against linear radius (arcsec)]

Figure 45. Surface brightness profiles. Left : the spiral galaxy NGC 0941, which shows an
excellent fit to an exponential over a large range of surface brightness. This means it is almost
purely disc, with very little bulge (From Bakos et al 2012). Right : the galaxy PGC38502. In this
case the outer parts fit an exponential, but the inner part is better fitted by an $R^{1/4}$ law - this is the
central bulge. (From MacDonald et al 2008). Note: these figures contain a lot of extraneous detail
you can ignore - they have come directly from research publications.

7.4. Disc rotation and masses

To really get at the physical properties of galaxies, we need to study their kinematics, i.e. their
motions, and then apply Newton’s laws. Studying the kinematics of galaxies actually breaks
into three parts. First, we look at the overall internal motions of galaxies - are they expanding,
collapsing, stationary, rotating? Then we look at the motions of individual stars inside galaxies.
Then finally we look at the motions of galaxies with respect to each other, and find that they often

collide with each other. But how do we measure galaxy motions? They are so far away that they
essentially never change their apparent position in the sky. Instead, we detect their motion in the
line of sight using the Doppler effect.

Exercise 67. A galaxy at a distance of 10 Mpc is moving at 500 km/s transverse to our line of
sight. How long would it take for its apparent position on the sky to change by 1 arcminute?

7.4.1. The Doppler effect

Imagine an object radiating waves (sound waves, light waves, water waves - it’s the same story),
which we can picture as expanding wavefronts as illustrated in Fig. 46. Now consider the object
in motion at speed v; each wavefront is emitted at a slightly different position. The result is
that the wavefronts are compressed in the direction of motion and stretched in the opposite
direction, so that the wavelength seen by an observer is shortened or lengthened respectively -
i.e. blueshifted or redshifted. The result is that the wavelength shift ∆λ = λ − λ0 is given by

(7.4)  $\frac{\Delta\lambda}{\lambda} = \frac{v_r}{v_w}$

where vr is the radial velocity of the object - i.e. the component of velocity in our line of sight -
and vw is the wave velocity (equal to c for light of course). Note that an object moving across the
line of sight produces no shift. We should also note that the formula above is appropriate for low
velocities - if we have a velocity near the speed of light we need Einstein’s relativity theory to get
the exact formula.
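
As a quick illustration of eqn 7.4 for light, the sketch below turns a measured wavelength into a radial velocity. It divides the shift by the rest wavelength, which is an assumption about how to read the formula (for the small shifts considered here it makes no practical difference), and the observed wavelength in the example is made up.

```python
C_KM_S = 299_792.458  # speed of light in km/s


def radial_velocity_km_s(lambda_observed, lambda_rest):
    """Low-velocity Doppler formula: v_r = c * (lambda - lambda_0) / lambda_0.

    Positive values mean the source is receding (redshift),
    negative values mean it is approaching (blueshift).
    """
    return C_KM_S * (lambda_observed - lambda_rest) / lambda_rest


# Hypothetical measurement: H-alpha (rest wavelength 656.3 nm) seen at 656.8 nm
v = radial_velocity_km_s(656.8, 656.3)
print(f"v_r = {v:.0f} km/s")
```
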

7.4.2. Galaxy rotation

Because galaxies are spatially extended objects, we can study their kinematics by the use of
spectral features and the Doppler effect. At each location on the sky we can obtain a spectrum of
some kind, and identify a spectral feature. An example would be the Hα transition of Hydrogen,
from n = 2 to n = 3, which appears in the spectra of some stars due to absorption in their
atmospheres. The integrated starlight from the many stars in a patch of a galaxy therefore
often shows this feature. According to atomic physics, it should be observed at a wavelength
λ0 = 656.3nm. What we typically find in spiral galaxies is that the light from one side of the
galaxy is blueshifted and the other side is redshifted, as illustrated in Fig. 46. This is the sign of
rotation - as the galaxy rotates, one side is rotating away from us, and so is redshifted, and the
other side is rotating towards us and so is blueshifted. From the observed shift ∆λ we can then
work out the rotation velocity as a function of radius, V (R).

A typical result is shown in Fig. 46. As you move out from the centre, at first the galaxy picks
up speed, and then it flattens off, and stays flat for as far as we can measure. When first
discovered, this was a big surprise. For planets orbiting the Sun, V declines as $V(R) \propto R^{-1/2}$ -
the force of gravity is declining as you get further from the gravitating mass. What is going on?

Figure 46. Galaxy rotation. Upper : illustrating how the Doppler effect works. Lower left : Cartoon
illustrating how we observe galaxy rotation - a spectral feature is redshifted or blueshifted depending
on which part of the galaxy it is coming from. Lower right : typical galaxy rotation curve. The curve
labelled “disk” shows the rotation curve you would predict from the observed starlight if all the mass
is in stars like the Sun. The curve labelled “halo” shows the modelled effect of dark matter. The
upper curve shows the overall model fit. (From Albada et al 1998)

7.4.3. Spiral galaxy masses and dark matter

In the solar system, essentially all the mass is in the middle of the system, i.e. in the Sun. A
galaxy on the other hand has its starlight, and so presumably its mass, smoothly extended. How
do we take account of this? We start with observed luminosity as a function of radius L(R). Next
we assume that on average stars are pretty much like the Sun (a good approximation), then we
can get mass as a function of radius, $M(R) = L(R) \times M_\odot/L_\odot$. If you are a star sitting at radius R,
then the force you feel depends on the mass interior to R. (This was proved in the Stars section
- see eqn 1.15). So using Newton’s laws as normal, we can predict the expected rotation curve
due to stars, V∗ (R). This is shown in the lower curve in Fig. 46. The predicted rotation curve still
disagrees with the observations badly.

One way to explain this is to propose that there is material which doesn’t shine which we haven’t
included in our calculation - dark matter. If we take the observed rotation curve, we can calculate
the total mass there must be within radius R. As usual, $V^2 = GM/R$, so $M_{\rm tot}(<R) = RV^2/G$.
To achieve our flat rotation curve, the mass must be increasing linearly with distance:

(7.5)  $M_{\rm dark}(<R) \propto R$  and so  $\rho_{\rm dark} \propto \frac{1}{R^2}$

The density ρ(R) is deduced on the assumption that the “dark matter halo” is spherical. You can
see that in the central parts of the galaxy, the mass is dominated by stars, but the outer parts
are dominated by the dark matter halo. Out to where we can make our measurements, the dark
matter halo is roughly ten times as massive as the stellar galaxy. Galaxies vary somewhat in
size, but a large galaxy like the Milky Way has about $10^{11}\,M_\odot$ in stars, and $10^{12}\,M_\odot$ in dark matter.
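
To get a feel for the numbers, the relation M(<R) = RV²/G can be evaluated directly. The sketch below assumes a perfectly flat rotation curve at 225 km/s (the speed quoted in these notes for the solar neighbourhood) and some illustrative radii; the constants are standard values assumed here.

```python
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30   # solar mass, kg
KPC = 3.086e19     # one kiloparsec in metres


def enclosed_mass_msun(radius_kpc, v_km_s):
    """Total mass inside radius R from M(<R) = R V^2 / G, in solar masses."""
    r = radius_kpc * KPC
    v = v_km_s * 1.0e3
    return r * v ** 2 / G / M_SUN


# Flat rotation curve: same speed at every radius (radii are illustrative)
for radius in (8.5, 17.0, 34.0):
    print(f"R = {radius:5.1f} kpc   M(<R) = {enclosed_mass_msun(radius, 225.0):.2e} M_sun")
```

Doubling the radius doubles the enclosed mass, which is just the linear growth in eqn 7.5.
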

Exercise 68. The neutral Hydrogen (HI) emission from a galaxy is measured, and after correction
for the mean recession velocity of the galaxy as a whole, the HI line from one side of the galaxy
is found to be at λ = 21.35cm and from the other is found to be at λ = 21.39cm. Estimate the
rotation velocity of this galaxy.

Exercise 69. The Sun is estimated to be at a distance of 8.5kpc from the centre of the Galaxy.
Stars in the solar neighbourhood have slight random velocities, but overall seem to be rotating
around the Galactic centre at 225 km s$^{-1}$. How many times has the Sun been around the Galaxy?

7.5. Random star motions and interactions in spheroids

If we try the same trick with elliptical galaxies, we find that they do not rotate. On the other hand,
their spectral features are smeared out in wavelength. This is also due to the Doppler effect.
The stars are not rotating in a common plane; they are buzzing around in all sorts of random
directions - sometimes in circular orbits, sometimes in very elliptical ones. Although they do not
show a systematic rotation, the typical random velocity should still be given by Newton’s laws,
so that we expect $V^2 \sim GM/R$ where M and R are the mass and size of the galaxy. We can
then use the observed line-smearing to find the spread of velocities σV , which we can take as a
typical velocity v, and then use this to estimate mass. The result is similar to spirals - the stars
in elliptical galaxies are buzzing about at speeds of several hundred km/sec, and we deduce
that they have a factor ten too much mass compared to the expectation from their starlight. In
fact, if we look at spiral galaxies more carefully, we find that although the stars in the disc are
rotating systematically, the bulge and halo stars are moving chaotically in random directions, like
the stars in elliptical galaxies. This is a general pattern for spiral galaxies - they have multiple
components with different kinematics.

7.5.1. Stellar collisions

If the stars in a spheroid are moving randomly, rather than bodily together, will they collide with
each other? Perhaps a spheroidal galaxy is like a gas, with stars jiggling around at random and
constantly interacting with each other? Let’s calculate how often stars should collide.

Consider two stars each of radius R∗ , passing by each other with separation D. They will collide
if D < Dmin = 2R∗ . However, the size of stars is much less than their typical separation Rsep , so
in any one passage there is only a tiny probability that the stars will collide.

Exercise 70. If most stars are like the Sun and the typical separation is about 1pc, estimate the
probability of collision in each passage past another star.

However, there are many chances of collisions as the star criss-crosses the galaxy over billions
of years. As the star moves, it presents a target cross-section [26] to the other stars of area
$\sigma = \pi D_{\rm min}^2 = 4\pi R_*^2$. You can think of it as sweeping out a cylindrical volume as it moves. (See
Fig. 47). If the star is moving at speed v, after time t the volume of the cylinder will be V = σvt.
Suppose now that the number density of stars is n per unit volume; then the number in our
cylinder will be N = nV = nσvt. This will be the number of stars that our target star will collide
with in time t. If we set N = 1, then this will give us the timescale between collisions :

(7.6)  $t_{\rm coll} = \frac{1}{n\sigma v}$  where in this case $\sigma = 4\pi R_*^2$

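The typical random velocity of a star in a spheroid might be v = 200 km/s. Putting in this number, along
with Sun-like stars ($R_* = R_\odot$) and a number density $n \sim 1\,{\rm pc}^{-3}$ (a typical separation of about 1pc),
we find $t_{\rm coll} = 7.6 \times 10^{17}$ years - much longer than the age of the Universe!! To a very good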
approximation, stars never physically collide. Compared to their sizes, they are just too far apart.
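
Here is a rough sketch of that calculation in Python, using eqn 7.6. It assumes Sun-like stars and one star per cubic parsec (consistent with a typical separation of about 1pc); the constants are standard values assumed here, and the function name is just for illustration.

```python
import math

R_SUN = 6.957e8     # solar radius, m
PARSEC = 3.086e16   # one parsec in metres
YEAR = 3.156e7      # one year in seconds


def collision_time_years(n_per_pc3, interaction_radius_m, v_km_s):
    """t_coll = 1 / (n sigma v), with sigma = pi * (interaction radius)^2."""
    n = n_per_pc3 / PARSEC ** 3              # stars per cubic metre
    sigma = math.pi * interaction_radius_m ** 2
    v = v_km_s * 1.0e3
    return 1.0 / (n * sigma * v) / YEAR


# Physical collisions: the interaction radius is D_min = 2 R_sun,
# so sigma = 4 pi R_sun^2 as in eqn 7.6
t = collision_time_years(1.0, 2.0 * R_SUN, 200.0)
print(f"t_coll = {t:.1e} years")
```

The answer comes out within a percent or so of the figure quoted above; the small difference depends on the exact constants used.
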

[Figure 47 labels. Left panel: R, vt. Right panel: star mass m1, velocity v; star mass m2; fast approach; slow approach; impact parameter RI; separation R; gravitational force F = Gm1m2/R^2.]

Figure 47. Left: illustrating the idea of swept-out volume for calculating collision rates. R is the
interaction radius (e.g. 2R∗ for physical collisions), and vt is the distance travelled in time t. Right:
geometry for gravitational “collisions”. The key parameters are the incoming velocity v and the
impact parameter RI . A closer approach and/or a lower velocity will result in a larger deflection.

7.5.2. Gravitational interactions

Just because stars don’t collide, doesn’t mean they have no influence on each other. Every star
produces a gravitational force on its neighbours. As one star passes another, even when they
are too far apart to physically collide, the gravitational interaction produces a deflection in its
path, as illustrated in the right hand panel of Fig. 47. A closer passage will produce a larger
deflection. Referring to figure 47, we can characterise a passage by its “impact parameter” - the
distance RI of closest approach that would be seen if there was no deflection. Take one star
to be stationary and the other to be passing at speed v, and consider the gravitational force on
the moving star. Resolve the force into components along the asymptotic approach line, and
perpendicular to this. The horizontal component switches sign as the star passes, so the net
effect is zero. The perpendicular component however is always in the same direction. Now we
make a rough approximation, that the force is only significant while $R \sim R_I$. This is true for a time
$\Delta t \sim 2R_I/v$ and so the force can be taken as roughly constant during this time at $F \sim Gm^2/R_I^2$.

[26] Note that σ is the symbol used for a cross-sectional area in many different areas of physics.

Now we apply Newton’s law, F = ma = mdv/dt. So if we apply a force F for a time ∆t it will
produce a change in velocity

    $\Delta v = \frac{dv}{dt}\,\Delta t = \frac{2R_I}{v}\,\frac{F}{m} = \frac{2R_I}{v}\,\frac{Gm}{R_I^2} = \frac{2Gm}{R_I v}$

The smaller RI is the larger the deflection. We can define a major deflection as one which gives
a change of velocity similar to the star’s initial velocity, ∆v = v. If we then solve for this value of
RI this gives a gravitational interaction radius

(7.7)  $R_{\rm grav} = \frac{2Gm}{v^2}$

Exercise 71. An alternative way to estimate “gravitational radius” is to find the distance where
the kinetic energy of the passing star is equal to the gravitational potential energy between the
two. Assuming the two stars are similar, work out the value of Rgrav this gives and compare it to
the ∆v method.

To summarise: if two stars pass each other at a distance much larger than Rgrav , they will swing
by essentially unaffected. If they pass closer than Rgrav , the stars are significantly deflected in
their orbits. Of course, in reality, you get gradually more deflection as you get closer. We have
drawn a kind of dotted line and decided to label the deflection “big” inside this. For $m = 1\,M_\odot$
and v = 200 km/s, we find that $R_{\rm grav} = 9.53\,R_\odot$; so stars can interact over a significantly larger
radius than the physical collision size. If we use this gravitational “collision” size in equation 7.6
(with $\sigma = \pi R_{\rm grav}^2$), we get a collision timescale of $\sim 3 \times 10^{16}$ years, still much longer than the age of the Universe. Have
we oversimplified by considering only the strong interaction radius? After all, weak interactions
are much more common - shouldn’t we integrate over all those? A detailed analysis shows that
this does indeed reduce the effective deflection timescale for a typical star, but only by roughly
an order of magnitude. This still gives an answer longer than the age of the Universe.
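
The gravitational radius and the corresponding timescale can be checked in the same spirit. The sketch below uses eqn 7.7 and then eqn 7.6 with a cross-section of π R_grav², again assuming one Sun-like star per cubic parsec; constants are standard values assumed here.

```python
import math

G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30    # solar mass, kg
R_SUN = 6.957e8     # solar radius, m
PARSEC = 3.086e16   # one parsec in metres
YEAR = 3.156e7      # one year in seconds


def gravitational_radius_m(mass_kg, v_km_s):
    """R_grav = 2 G m / v^2 (eqn 7.7), in metres."""
    v = v_km_s * 1.0e3
    return 2.0 * G * mass_kg / v ** 2


v_km_s = 200.0
r_grav = gravitational_radius_m(M_SUN, v_km_s)

# Strong-deflection timescale: eqn 7.6 with sigma = pi * R_grav^2
n = 1.0 / PARSEC ** 3                    # assumed: one star per cubic parsec
sigma = math.pi * r_grav ** 2
t_years = 1.0 / (n * sigma * v_km_s * 1.0e3) / YEAR

print(f"R_grav = {r_grav / R_SUN:.2f} R_sun")
print(f"t_coll = {t_years:.1e} years")
```
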

7.5.3. The internal dynamics of galaxies

We have reached a fairly strong conclusion. It may be tempting to think of a galaxy as like a
“gas of stars” but the analogy is poor. In a gas (including the gas that galaxies contain) the
particles are constantly colliding with each other, and do not travel very far. In a galaxy, the
stars essentially never collide with each other, and spend their time swinging from one side of
the system to the other. Any one star at any one time does not really care about other individual
stars - it feels a gravitational force from the whole galaxy added up. Note that a collisionless
system is also a frictionless system.

We can now see how spirals and ellipticals will differ. In an elliptical, the stars will be on a random
variety of orbits, most of them criss-crossing the whole galaxy over time. This will also be true
of the halo stars in a spiral. However the disc stars have formed recently from the cold gas, and
will inherit the motion of that gas. The gas, unlike the stars, is highly collisional; neighbouring
patches of gas will stay close together, and through friction over time will settle down into a thin
rotating disc. The stars born out of that disc will likewise be mostly co-rotating.

Exercise 72. Think about it: In a spiral galaxy, the cold gas has collapsed vertically to a thin disc.
Why does it collapse only in this one direction? Why doesn’t the cold gas collapse to a point?
How would you go about working out what size the gas disc might collapse to horizontally?

7.6. Galaxy collisions

Stars never collide. But galaxies do collide with each other. Many of the galaxies classified as
Irregular turned out to be actually two galaxies in the process of collision. You can see examples
in Fig. 48.

Figure 48. Examples of galaxy interactions, all observed by HST (Credit NASA/ESA). Left :
Stephan’s Quintet. The middle two galaxies are just beginning the process of interaction. Middle :
NGC 2623, two galaxies in the process of merging. Right : Arp 220, in an advanced state of merging.

Why do galaxies behave so differently from stars? It is simply because, compared to their size,
they are much closer together. For stars, with $R_* \sim R_\odot$ and $R_{\rm sep} \sim 1$ pc, we have $R_{\rm sep}/R_* \sim 10^7$.
Galaxies don’t have clear edges, but roughly speaking $R_{\rm gal} \sim 10$ kpc and the typical separation
between galaxies is $R_{\rm sep} \sim 1$ Mpc. We then get $R_{\rm sep}/R_{\rm gal} \sim 10^2$ - very different from stars. We can
see straight away that collisions between galaxies will be much more common.

But because the stars are collisionless, doesn’t this mean that when the galaxies collide, they
just slide gracefully through each other? No, because each star feels the summed gravity from
two different galaxies, and starts to get very confused about how to move! This is why galaxy
interactions make strange non-symmetric disturbed structures as they pass by each other.

If a star is orbiting at distance R from the centre of a galaxy and a second galaxy of comparable
size passes, there will be significant disturbance to that star if the two centres pass within
distance D ∼ 2R. This means that the inner parts of galaxies, at R∼5 kpc say, get disturbed
relatively infrequently, but the outer stars and gas, at say R∼50 kpc, are around a hundred times
more likely to get disturbed by neighbouring galaxies.
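
The “hundred times” can be seen with one line of arithmetic, if we assume that the chance of a disturbing encounter scales with the target area π D², by analogy with the cross-section in eqn 7.6. That scaling is an assumption made for this sketch.

```python
def relative_disturbance_rate(r_outer_kpc, r_inner_kpc):
    """Ratio of target areas pi * (2R)^2 for two orbital radii."""
    return (2.0 * r_outer_kpc) ** 2 / (2.0 * r_inner_kpc) ** 2


# Outer stars and gas at 50 kpc compared with inner parts at 5 kpc
print(relative_disturbance_rate(50.0, 5.0))
```
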

Can we work out the collision timescale? To do this we need to know the number density of
galaxies per unit volume of space. However, the density of galaxies varies quite a lot from
one spot to another, and the density depends very strongly on whether we are talking about
big galaxies or small ones... we will return to this question in chapter 9, when we discuss the
geography and history of galaxies. For now we will just note that galaxy interactions are indeed
fairly common.

7.7. Galaxy components summary

We have seen that ellipticals have no gas, spheroidal structure, old stars, and random internal
stellar motions, whereas spiral discs have cold gas, a flat structure, young stars, and ordered
rotation in a single plane. However, we can see that spiral galaxies also contain spheroidal
components rather like elliptical galaxies. Additionally in the Milky Way we can see that the
different kinematic and structural components correspond to different populations of stars, which
have differing abundances because they were formed at very different ages. Fig. 49 summarises
what this all implies for the overall structure of the Milky Way and other galaxies.

Figure 49. Left: components in the Milky Way and other spiral galaxies (Credit: James
Schombert, Oregon). Right : Artist’s impression of the dark matter halo the galaxy is embedded in.
(Credit : Astrokatie blog)
8: ACTIVE GALAXIES

8.1. Outline of content


• Properties of Active Galactic Nuclei (AGN)
• Central masses in AGN
• Dark masses in normal galaxies
• Power from black hole accretion

All galaxies except irregulars are brightest at the centre. However a minority have an especially
bright central spot - it’s as if someone plonked a bright star on top of the centre or nucleus of
the galaxy. Over the years, it’s become clear that these nuclei show a set of bizarre and very
energetic properties that cannot be due to stars. A galaxy showing such properties is known
as an “active galaxy” and the nucleus concerned as an “Active Galactic Nucleus” (AGN). The
most common type of active galaxy seen locally is often known as a “Seyfert galaxy” after Carl
Seyfert who first studied them comprehensively in the 1940s. The famous “quasars”, discovered
in the 1960s, are actually just very rare, powerful, and distant AGN. We will summarise the key
properties of AGN, and in each subsection analyse the physical implication of each of these
key properties. Warning: a lot of this section will involve some very rough order-of-magnitude
estimating.

Object                            Typical luminosity
--------------------------------  ----------------------
Sun-like star                     $\sim 10^{26-27}$ W
Most massive known stars          $\sim 10^{31}$ W
Weakest known AGN                 $\sim 10^{33}$ W
Typical nearby Seyfert galaxy     $\sim 10^{37}$ W
Entire galaxy of stars            $\sim 10^{38}$ W
Most luminous known quasars       $\sim 10^{39-40}$ W

8.2. Properties of Active Galactic Nuclei (AGN)

8.2.1. Large luminosities

Given that the light comes from a tiny spot in the centre, AGN radiate a prodigious amount of
energy. This is put into context in the table above. A large fraction (around a third) of all known
galaxies have some kind of very weak active nucleus, with luminosities equivalent to a large star
cluster. However, 1% of galaxies, the Seyfert galaxies which first attracted attention, can have
nuclear luminosities which are around 10% of the luminosity of an entire normal galaxy of stars
like the Milky Way. Even more extreme are the objects known as “quasars”, or “quasi-stellar
objects” so-called because they look like stars in that they are point sources of light. They are
very distant and have large luminosities, up to $L \sim 10^{39}$ W or even more. They are in fact AGN in
very distant galaxies. So why do they look star-like? This is partly because they are very distant
so that their parent galaxy has a small angular size, hardly bigger than a point source, and partly
because the quasar is ten to a hundred times more luminous than the parent galaxy, so that the
parent galaxy is lost in the glare.
Exercise 73. If the parent galaxy of a quasar is about the same size as the Milky Way, estimate
how far away it needs to be in order to have an angular size comparable to the seeing size at a
good ground-based observatory. (Hint: review section 4.2.4)

Implications. Because nuclear activity occurs in a large fraction of galaxies, it is a quite normal
phenomenon. However it is not due to stars. What physical phenomenon can explain such a
large amount of energy coming from such a small spot? Let's gather more information before
returning to this key question.

8.2.2. Broad emission lines

Optical spectra of AGN, as shown in Fig. 50, show strong emission lines as well as very blue
continuum light. Normal galaxies do quite often show emission lines as well, coming from gas
photo-ionised by new stars, which are hot enough to shine in the UV, as we discussed in chapter
6. The gas in AGN is also being photo-ionised by UV light. In Fig. 50 UV wavelengths are not
shown, but you can see that the spectrum is very blue, and measurements which extend into
the UV confirm that the typical AGN has a large UV luminosity. We will discuss this UV source
further in the next section.

However, the most obviously striking thing about the AGN emission lines is that they are very
broad. The Hα line at 656.3nm in Fig. 50 is about 20nm across.

What does it mean? The broadening of the emission lines must be due to internal gas motions
in the nucleus. Some of the gas is moving away from us and so redshifted, whereas some of it is
moving towards us and so blueshifted. If we apply the Doppler formula, the typical velocity must
be around $v = c \times \Delta\lambda/\lambda = c \times 20/656.3 \approx 9{,}136$ km s$^{-1}$, which is about 3% of the speed of light. For
comparison, the orbital velocity of the Earth around the Sun is 30 km s$^{-1}$, and the orbital velocity
of the Sun around the Galaxy is 225 km s$^{-1}$. If the gas motion in AGN is due to gravitational
forces, the gas must be surrounding an extremely large mass, or be very close to a large mass,
or both. The region containing the gas emitting the broad emission lines is known as the BLR
(Broad Line Region).
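
As a rough numerical check of the Doppler estimate above, here is a minimal Python sketch. The function name `doppler_velocity_km_s` is our own illustrative choice, and we assume the non-relativistic Doppler formula applied to the full 20 nm line width, as in the text.

```python
# Non-relativistic Doppler estimate of the gas speed from a line width.
C_KM_S = 299_792.458  # speed of light in km/s


def doppler_velocity_km_s(delta_lambda_nm: float, lambda_rest_nm: float) -> float:
    """Return v = c * (delta_lambda / lambda) in km/s."""
    return C_KM_S * delta_lambda_nm / lambda_rest_nm


# H-alpha at 656.3 nm with a width of about 20 nm (values from the text).
v_blr = doppler_velocity_km_s(20.0, 656.3)
print(f"v ~ {v_blr:.0f} km/s, i.e. {v_blr / C_KM_S:.1%} of the speed of light")
```

The same function can be reused for any other line, given its rest wavelength and measured width in the same units.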

Figure 50. Optical spectrum of a typical AGN (NGC 5548); the vertical axis is flux per unit
wavelength. Note the blue colour of the spectrum, and the height and width of the emission
lines. (Taken from the textbook by Brad Peterson.)

Exercise 74. In a specific AGN, the radius of the broad line region has been estimated as 0.01pc,
and the typical gas motion measured as 12,000 km/s. Estimate the mass of the object causing
the gas motion.

8.2.3. Multi-wavelength spectral energy distribution

Perhaps the most striking thing about AGN, as shown in Fig. 51, is that they radiate strongly at
all sorts of wavelengths - optical, UV, X-rays, gamma-rays, and IR. Ordinary galaxies (see the
lower plot in Fig. 51 for comparison) emit most of their radiation in optical light (which comes from
stars), plus about 10% in the infrared, which comes from dust absorbing some of the starlight,
heating it up, and re-radiating in the far-IR, as we discussed in Chapter 6. Normal galaxies do
emit other kinds of light, as we also discussed in chapter 6, but in the broad picture the other
kinds of light don’t add up to much. The feature which dominates the energy output of most
AGN, however, is the “Big Blue Bump”, which peaks in the UV. There are secondary peaks
of emission in the mid-infrared (at a wavelength of around 10µm), and in the X-rays.

What does it mean? There seems to be material at three different temperatures. If we assume
that the emission is blackbody radiation, we can estimate the temperatures concerned, using
Wien's displacement law for the peak wavelength, $\lambda T = 2.9 \times 10^{-3}$ m K.

• The dominant UV emission peaks around λ = 0.03µm, suggesting $T \sim 10^5$ K. However,
the UV bump looks broader than a single blackbody.
• The IR peaks around λ = 10µm, suggesting T ∼ 300 K. The IR bump is almost certainly
due to dust re-radiation, but the dust is hotter than the dust we normally see in the
interstellar medium in normal galaxies, which is at T ∼ 50 K.
• The X-ray emission covers a wide range of photon frequencies. It is almost certainly not
due to blackbody radiation, but if the emission is due to hot gas in some way, it must
have a wide range of temperatures, $10^{7}$ to $10^{9}$ K.
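
The two blackbody temperature estimates in the list above can be reproduced with a few lines of Python. This is only a sketch: the function name `wien_temperature` is our own, and it uses the rounded constant quoted in the text.

```python
# Blackbody temperature from the peak wavelength, via lambda * T = 2.9e-3 m K.
WIEN_B = 2.9e-3  # m K, rounded value used in the text


def wien_temperature(peak_wavelength_m: float) -> float:
    """Temperature in kelvin of a blackbody peaking at the given wavelength."""
    return WIEN_B / peak_wavelength_m


t_uv = wien_temperature(0.03e-6)  # UV bump peaking near 0.03 micron
t_ir = wien_temperature(10e-6)    # IR bump peaking near 10 micron
print(f"UV bump: T ~ {t_uv:.1e} K")
print(f"IR bump: T ~ {t_ir:.0f} K")
```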

Figure 51. Very schematic illustration of the spectral energy distribution (SED) of a typical AGN,
compared to a typical ordinary galaxy. (Relative luminosity against wavelength/frequency, across
the far-infrared, mid-infrared, optical, ultraviolet, X-ray and gamma-ray bands.)

The size of AGN. If we assume that the dominant UV emission is coming from a spherical
surface of radius R at temperature T, then it ought to have a luminosity

$$L = 4\pi R^2 \sigma T^4$$

If we measure the luminosity L, and the temperature T, from the UV peak, then we can estimate
the size R of the UV emitter. Suppose we take as an example a quasar with a moderately high
luminosity, $L = 10^{38}$ W. To radiate that power at a temperature of $10^5$ K, we find that the radius
must be $R \sim 1.2 \times 10^{12}$ m - by astronomical standards, a very small size.
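
The size estimate above follows from inverting the blackbody luminosity formula for R. A minimal sketch, assuming a rounded Stefan-Boltzmann constant and a function name (`blackbody_radius_m`) of our own choosing:

```python
import math

SIGMA_SB = 5.670e-8  # Stefan-Boltzmann constant in W m^-2 K^-4 (rounded)


def blackbody_radius_m(luminosity_w: float, temperature_k: float) -> float:
    """Radius of a spherical blackbody, from L = 4 pi R^2 sigma T^4."""
    return math.sqrt(luminosity_w / (4.0 * math.pi * SIGMA_SB * temperature_k**4))


# Canonical quasar from the text: L = 1e38 W radiating at T = 1e5 K.
r_uv = blackbody_radius_m(1e38, 1e5)
print(f"R ~ {r_uv:.2e} m")
```

Because R scales as the square root of L and as the inverse square of T, the result is much more sensitive to the assumed temperature than to the luminosity.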

Exercise 75. Which does the size of a quasar most closely correspond to? A typical galaxy?
The typical space between stars? The solar system? The Earth?

Exercise 76. The IR luminosity in a typical quasar is about 3 times less than the UV luminosity.
Once again assuming a spherical body for rough estimate purposes, show that the ratio of the
size of the IR emitting region to the size of the UV emitting region is ∼ 64,000.

The IR emission very likely comes from a dusty region surrounding the quasar that is a long
way from the real action, but soaks up quite a large fraction of the outgoing energy. In a similar
manner, can we deduce the size of the X-ray emitting region? Unfortunately, in this case, the
blackbody assumption would almost certainly be wrong, so we can’t use our standard formulae.
The current best guess is that the X-ray emission is coming from sparse, optically thin gas, as
opposed to the opaque surface you need for blackbody emission. Such thin gas gives much
less emission for a given volume. This situation is reminiscent of the million degree gas that
surrounds the Sun in the corona. It seems likely that in a similar way the UV bump emitting
material has a hot corona of some kind.

8.2.4. Variability

Most astronomical objects change only very slowly. Sometimes you have pulsating variables like
Cepheids, and occasionally there are explosive events like supernovae. AGN are different again.
They tend to vary erratically, by anything from 10% to a factor of two. Two examples are shown in
Fig. 52. Low luminosity objects tend to vary faster than the high luminosity objects. The dominant
UV emission varies on a timescale of days-weeks in Seyfert galaxies, and months-years in the
higher luminosity quasars.

Figure 52. Left: UV and X-ray variability in the relatively low luminosity AGN NGC 7469, on a
timescale of days. (From Nandra et al 1997). Right: UV variability in the high luminosity quasar
3C 273 on a timescale of years. (From the 3C 273 database maintained by Marc Turler in Geneva).
Axis labels: relative brightness against time in days.

Implications. A large object cannot change fast. For all the parts of an object to go up and
down together, as opposed to varying randomly producing an overall washed out effect, they
need to be physically in contact with each other by some mechanism which transmits the effect
of a disturbance - pressure waves, electromagnetic effects, light signals, or whatever. If such a
disturbance transmission mechanism travels at speed $v_d$, and the size of the object is R, then
the fastest that changes should happen is given by

$$\Delta t = \frac{R}{v_d} \tag{8.1}$$

Exercise 77. We saw earlier that a luminous quasar with $L \sim 10^{38}$ W must have a size $R \sim 10^{12}$ m.
If such an object varies on a timescale of a year, estimate the speed of disturbances $v_d$.

Pressure waves in hot gas. Our estimate of the disturbance speed is very close to the expected
speed of pressure waves, i.e. sound speed. Wave-like disturbances vary somewhat in detail,
but basically they are all transmitted microscopically by lots of particles colliding with each other
in a kind of chain. The speed a disturbance travels is therefore always roughly the same as the
typical particle speed. If we equate the typical thermal particle energy $E = 3kT/2$ with the kinetic
energy $K = mv^2/2$, and assume that most of the particles are Hydrogen atoms or ions with a
mass not very different from the proton mass $m_p$, then

$$v_d(\mathrm{pressure}) \sim \sqrt{\frac{3kT}{m_p}} \tag{8.2}$$

Putting in $T = 10^5$ K this gives us a speed of 49.8 km s$^{-1}$, tolerably close to our observational
estimate above. This doesn’t prove exactly that variability is caused by sound waves, because
we have made a very rough estimate, but some kind of mechanical disturbance seems most
likely.
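
The pressure-wave speed of equation (8.2) is easy to evaluate numerically. The sketch below uses rounded constants and our own function name, `thermal_speed_m_s`. It also reports how long such a disturbance would take to cross a region of the size found for the UV emitter, $10^{12}$ m.

```python
import math

K_B = 1.381e-23   # Boltzmann constant in J/K (rounded)
M_P = 1.673e-27   # proton mass in kg (rounded)
YEAR_S = 3.156e7  # seconds in one year (rounded)


def thermal_speed_m_s(temperature_k: float) -> float:
    """Typical particle speed from 3kT/2 = m v^2 / 2, for particles of proton mass."""
    return math.sqrt(3.0 * K_B * temperature_k / M_P)


v_sound = thermal_speed_m_s(1e5)
crossing_time_yr = 1e12 / v_sound / YEAR_S  # Delta t = R / v_d with R = 1e12 m
print(f"v_d ~ {v_sound / 1e3:.1f} km/s")
print(f"crossing time for R = 1e12 m: {crossing_time_yr:.2f} yr")
```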

Summary. Overall we are reaching a reasonably consistent story. Luminous AGN radiate at
a characteristic temperature around 100,000 K, with a size around $10^{12}$ m, and they tend to be
erratically unstable. Next, we would like to know how massive these objects are, but we will start
a whole new section for that important topic.

8.3. Central masses in AGN

To measure masses we use the tried and trusted technique of measuring a velocity and a radius
and using $v^2 = GM/R$. We will look at two different ways of doing this.

8.3.1. Resolved motions

Ideally what we want is some emitting material for which we can directly measure both the radial
distance R from the centre, and the velocity v of that material. Unfortunately for most AGN, most
of what we see is in a single unresolved spot. However, for some of the closest AGN, we can
see glowing gas which is ionised by the AGN at large enough distances to visually separate from
the central spot. A famous example is shown in Fig. 53, which shows the result of two spectra
of M87 taken just either side of the nucleus. The Hα emission line is systematically offset in
wavelength on either side of the nucleus. It seems that we have seen a disc of gas rotating -
on one side the gas is approaching us, and on the other side the gas is receding. We know the
angular separation of the two spots and the distance of M87, so we can get the linear size. From the
wavelength separation ∆λ we get the velocity V from the Doppler effect. The result is that we find
a mass causing those motions of $M \sim 3 \times 10^9\,M_\odot$. However, within the same radial distance we can
estimate the amount of starlight, and so the mass in stars. This is at most around ten million solar
masses. Whatever that large mass is, it's not stars. (Note that M87 has radio jets, but no bright
UV bump.)
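
The chain of reasoning above (angle and distance give a radius; radius and velocity give a mass) can be sketched in Python. The input numbers below are purely illustrative, made-up round values, not the M87 measurements, and the function names are our own. The sketch assumes a circular orbit seen edge-on and uses the small-angle approximation.

```python
import math

G = 6.674e-11      # gravitational constant in SI units (rounded)
M_SUN = 1.989e30   # solar mass in kg (rounded)
PC_M = 3.086e16    # one parsec in metres (rounded)
ARCSEC_RAD = math.pi / (180.0 * 3600.0)  # one arcsecond in radians


def linear_radius_m(theta_arcsec: float, distance_pc: float) -> float:
    """Small-angle conversion of an angular offset to a physical radius."""
    return theta_arcsec * ARCSEC_RAD * distance_pc * PC_M


def enclosed_mass_kg(v_m_s: float, r_m: float) -> float:
    """Mass from v^2 = GM/R for a circular orbit."""
    return v_m_s**2 * r_m / G


# Hypothetical inputs: gas 0.2 arcsec from the nucleus of a galaxy at 20 Mpc,
# orbiting at 400 km/s.
r = linear_radius_m(0.2, 20e6)
mass_msun = enclosed_mass_kg(400e3, r) / M_SUN
print(f"R ~ {r / PC_M:.1f} pc, M ~ {mass_msun:.1e} solar masses")
```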

Exercise 78. The nearby radio galaxy M84 also seems to show a rotating ionised gas disc,
which can be resolved with HST. Spectroscopic measurements were made 0.1” either side of
the nucleus of the NII line at 658.3nm. The apparent velocity difference between those two
measurements is 800 km s−1 . Given that M84 is at a distance of 17 Mpc, estimate the mass of
the central object.

Exercise 79. Given that we seem to be looking at a flat rotating disc, can you think of a significant
uncertainty in this estimate, that comes purely from geometry?

Figure 53. Gas disc rotation in M87, as measured by Harms et al 1994 with Hubble Space
Telescope. (Credit : NASA). The inset on the right shows an image of the central region of
the galaxy, with the two circles showing locations where light was collected to put through the
spectrograph. The line diagram on the left shows a small portion of the spectra from each location,
showing a strong emission line, but at differing wavelengths at the two locations.

8.3.2. Emission line widths and time lags

For our second method, we deduce the size of the line-emitting gas region by watching the
lines vary in strength. We have mentioned that the UV light from AGN is variable, and that the
emission lines are ionised by that UV light. As you might expect, the lines respond to those UV
variations and go up and down as well - but with a time lag. The time lag $t_{lag}$ is due to the light
travel time from the UV source at the centre of the AGN to where the emission line gas is, and
therefore gives us an estimate of the size of the broad line region, $R_{BLR} = c\,t_{lag}$. This effect has
been seen in a few dozen AGN now. An example is shown in Fig. 54, with a time lag of around
14 days. A more luminous object, such as we have been using as our canonical example with
$L \sim 10^{38}$ W, will have a delay around ten times as long. This implies a size of $R_{BLR} \sim 3 \times 10^{15}$ m,
a few thousand times larger than the size we deduced for the UV bump emitting region.

Once we have the size $R_{BLR}$, we can use the line width to estimate the velocity of the BLR gas,
$V_{BLR}$, and we get the central mass that the gas motions are responding to, $M = R_{BLR} V_{BLR}^2/G$.
Using the time lag method, a range of masses has been found for AGN central objects, covering
roughly $10^{7}$ to $10^{9}\,M_\odot$.
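
A minimal sketch of the time lag method follows, with our own function names. The 140 day lag is ten times the 14 day example, as in the text. The 9,000 km/s velocity is an assumption on our part, borrowed from the Hα line width estimate in section 8.2.2 rather than measured for this particular object.

```python
G = 6.674e-11     # gravitational constant in SI units (rounded)
C = 2.998e8       # speed of light in m/s (rounded)
M_SUN = 1.989e30  # solar mass in kg (rounded)
DAY_S = 86_400.0  # seconds in one day


def blr_radius_m(lag_days: float) -> float:
    """Broad line region size from the light travel time, R = c * t_lag."""
    return C * lag_days * DAY_S


def blr_mass_kg(lag_days: float, v_blr_m_s: float) -> float:
    """Central mass from M = R_BLR * V_BLR^2 / G."""
    return blr_radius_m(lag_days) * v_blr_m_s**2 / G


r_blr = blr_radius_m(140.0)
m_bh_msun = blr_mass_kg(140.0, 9.0e6) / M_SUN
print(f"R_BLR ~ {r_blr:.1e} m, M ~ {m_bh_msun:.1e} solar masses")
```

With these inputs the mass lands at the upper end of the range quoted above, which is what we would expect for a luminous object.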

Figure 54. Continuum and emission light curves for the AGN MKN335. Note the continuum light
curve shows the optical light, but we assume that the UV light tracks this. The measured delay is
13.9±0.9 days. (From Grier et al 2012).

8.3.3. The range of AGN masses

We noted in section 8.2.1 that AGN cover a large range in luminosity, about a factor of $10^6$. The
range in mass is somewhat less, about a factor of a thousand, from a few times $10^6\,M_\odot$ to a few
times $10^9\,M_\odot$. This is reminiscent of the situation with stars, which range over about a factor of
a billion in luminosity, but only about a factor of 500 in mass. However, there is a key difference
between stars and AGN in this regard. For stars, or at least for main sequence stars, the mass
pretty much determines the luminosity - it's just that it's a very steep dependence. For AGN,
although mass and luminosity are connected, the correlation is poor. At any one mass, AGN can
show a large range of different luminosities. It seems that for AGN there is a second parameter
involved. We shall see below what that parameter is.

8.4. Dark masses in normal galaxies

There is then clear evidence for some kind of object with a large mass in the centre of Active
Galaxies. But in fact there is also evidence for massive objects in the nuclei of normal galaxies.
It seems that the massive dark object is always there - it's just that sometimes it is active and
sometimes it is not. The evidence for dark objects comes from the motion of stars.

8.4.1. Central stellar motions

By measuring the width of absorption lines in the light from a galaxy, we can estimate the speed
of random motions of the stars in that galaxy. If we make such a measurement at a series of
different locations within the galaxy, we can see that stellar velocities start to increase rapidly
towards the centre of most galaxies. The stars are responding to the gravity of some kind of
central object. Once again, we can use the Doppler formula plus $V^2 = GM/R$ to estimate the
mass concerned. The results we get cover a wide range, similar to that found for AGN: $10^{6}$ to $10^{9}\,M_\odot$.
From the starlight we can estimate the stellar mass, and it is always far less. A dark central mass
seems to be ubiquitous in normal galaxies.

8.4.2. The black hole at the centre of the Milky Way

Figure 55. Orbits of stars in a very small region of the Galactic Centre showing motion around the
central dark object. (From the UCLA Ghez et al website ; see also the Max Planck Genzel et al
website.)

The centre of our own Galaxy shows some extremely feeble signs of some kind of possible
nuclear activity. However it shows very clear signs of a dark and compact central mass. There is
a cluster of stars about 1 pc across at the centre of the Milky Way. Over many years, two groups
in Germany and California have been imaging these central stars repeatedly, and have watched
their positions slowly change. (See Fig. 55.) These stars are clearly orbiting an unseen object.
From the star orbits, we can locate the position and mass of this object, which turns out to be
$M \sim 4.3 \times 10^6\,M_\odot$. Note that at a distance of 8.3 kpc, 1 pc is about 25″ across. Although the image
of each star is somewhat fuzzy, we can measure the centroid position of each star to much less
than the size of the fuzzy blob. [27]

As well as being invisible, this dark object is extremely compact. One of the star orbits (for the
star named S2) has a pericentre that comes within 120 A.U. of the position of the dark object.
The dark mass must have a size less than this. For an object with a mass of several million
times the mass of the Sun, but a size smaller than the solar system, it's hard to imagine it's
anything other than a black hole.

[27] Progress has required two key pieces of technology - IR detectors, and Adaptive Optics. I will
show some pictures in lectures.

Exercise 80. Recall that the event horizon of a black hole is at $R_{EH} = 2GM/c^2$. (See eqn 3.31).
How close to the event horizon does the star S2 get on its orbit about the Galactic Centre black
hole?

As you can see from the above exercise, the dark object could still be quite a lot bigger than the
event horizon of a black hole. However, for an object this massive, there is no known mechanism
that can support it, so it will soon collapse anyway and end up as a black hole. The timescale for
this is short by astronomical standards, as you can work out for yourself in the next exercise.
Exercise 81. Suppose that the dark mass in the Galactic Centre is actually not a black hole, but
a cold sphere of gas of radius just slightly less than the pericentre of the star S2, R = 100AU .
How long would this take to collapse to a black hole state? To estimate this, use the same
technique as Exercise 8 in section 1.4.1.

8.5. Power from black hole accretion

We have seen that some galaxies contain “active nuclei” that somehow produce enormous
amounts of energy. We have also seen that dark masses - probably black holes - are common in
almost all galaxies. What is the connection? How does this black hole produce energy? And why
are some active and some not? The answer seems to be that the energy comes from accretion
onto the black hole.

8.5.1. Energy from gravity

Consider a large mass M which has a small radius R, and then add just a little bit more matter
∆m. If the small extra mass ∆m starts from a large distance and falls onto the large mass M at
radius R, it will have gained energy

$$E = \frac{GM\,\Delta m}{R}$$

If our small mass ∆m is a solid lump, the extra energy E will just end up as kinetic energy.
But if ∆m is a volume of gas, friction and collisions on the way down should randomise the
energy and turn it into heat. That energy can then be radiated back into space. For our large
mass M, the smaller R is, the more energy we get when new matter accretes onto it. The most
energy we could get out of this process is when the object is a black hole, with event horizon
$R_{EH} = 2GM/c^2$. Then

$$E = GM\,\Delta m \cdot \frac{c^2}{2GM} = \frac{1}{2}\,\Delta m\,c^2$$

You will recall that $mc^2$ is the rest mass energy according to Einstein’s relativity. We can define
the efficiency of energy extraction of any energy generating process as $\mu = E/mc^2$ - i.e. the
fraction of the rest mass energy that we extract. For nuclear fusion, this is $\mu_{nuc} = 0.007$, so black
hole accretion seems to be much more efficient than even nuclear fusion.

Disc accretion. Can we really get µ = 0.5? The answer seems to be “no but nearly”. If the
material plunges radially inwards in free fall, the kinetic energy may not convert to heat before
it disappears inside the black hole. In fact, radial free fall is unlikely. At larger distances, the
accreting material will always have at least a little angular momentum, and end up forming a
rotating disc around the black hole. Friction between neighbouring radial annuli then allows
the material to slowly spiral inwards, forming a gradually heated accretion disc. However, two
effects reduce the efficiency µ. The first, which we can’t prove here but will state, is that no
stable orbits are possible closer to a black hole than $R = 3R_{EH}$. So outside that radius we have
the slowly accreting disc; inside that the material plunges in. So the effective inner radius is
$3R_{EH} = 6GM/c^2$.

Knowing that our infalling material is in a rotating disc, we find a second effect. Consider a
small mass ∆m at radius R. It has potential energy $U = -GM\,\Delta m/R$ and kinetic energy
$K = \Delta m\,v^2/2$ with $v^2 = GM/R$, and so $K = GM\,\Delta m/2R$. Now suppose the radius of our
small mass changes by a small amount ∆R. The changes in U and K are

$$\Delta U = \frac{dU}{dR}\,\Delta R = \frac{GM\,\Delta m}{R^2}\,\Delta R \quad\text{and}\quad \Delta K = \frac{dK}{dR}\,\Delta R = -\frac{GM\,\Delta m}{2R^2}\,\Delta R$$

i.e. ∆K is half the size of ∆U, with the opposite sign. So of the potential energy lost in descent
at each step, half goes into increased rotation speed, and the other half is available to heat the
gas; this reduces the efficiency by another factor of two. The net result seems to be that we should
be able to get an efficiency µ = 1/12 from disc accretion. The real situation may be somewhat more
complex than our simple analysis, but generally it is assumed that µ ∼ 0.1 can indeed be achieved.
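
The bookkeeping behind µ = 1/2 and µ = 1/12 can be written out exactly with rational arithmetic. This is just a sketch of the simple Newtonian argument above; the function name `accretion_efficiency` and its parameters are our own.

```python
from fractions import Fraction

MU_FUSION = Fraction(7, 1000)  # nuclear fusion efficiency, 0.007, from the text


def accretion_efficiency(r_in_over_r_eh, heat_fraction=Fraction(1)):
    """Fraction of rest mass energy released by matter reaching radius r_in.

    With E = GM dm / r_in and R_EH = 2GM/c^2, E / (dm c^2) = 1 / (2 r_in / R_EH).
    heat_fraction is the share of that energy which ends up as heat.
    """
    return heat_fraction / (2 * Fraction(r_in_over_r_eh))


mu_radial = accretion_efficiency(1)                            # straight to the horizon
mu_disc = accretion_efficiency(3, heat_fraction=Fraction(1, 2))  # disc ending at 3 R_EH
print(f"radial infall: mu = {mu_radial}")
print(f"disc accretion: mu = {mu_disc}")
print(f"disc accretion / fusion = {float(mu_disc / MU_FUSION):.1f}")
```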

8.5.2. Accretion luminosity

Above we have considered the energy gained by accreting a specific lump of matter ∆m. Now
suppose we continuously rain down matter. The rate of matter infall per unit time is known as the
accretion rate and is usually given the symbol ṁ - i.e. it is the rate of change of mass, dm/dt,
in units of kg s$^{-1}$. In each unit of time an amount of matter ṁ gains a fraction µ of its rest mass
energy. The luminosity we get should therefore be

$$L = \mu \dot{m} c^2 \quad \text{with efficiency } \mu \sim 0.1 \tag{8.3}$$

It seems that this formula does not involve the black hole mass. Apparently it doesn’t matter how
big the black hole mass is, as long as we accrete matter fast enough. Is that really correct? Well,
it would be were it not for the fact that any system has an upper limit on its luminosity, which we
look at next.
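
To get a feel for the numbers in equation (8.3), we can invert it for the accretion rate needed to power our canonical $10^{38}$ W quasar. A minimal sketch with rounded constants and our own function name:

```python
C = 2.998e8       # speed of light in m/s (rounded)
M_SUN = 1.989e30  # solar mass in kg (rounded)
YEAR_S = 3.156e7  # seconds in one year (rounded)


def accretion_rate_kg_s(luminosity_w: float, mu: float = 0.1) -> float:
    """Accretion rate from L = mu * mdot * c^2."""
    return luminosity_w / (mu * C**2)


mdot = accretion_rate_kg_s(1e38)
mdot_msun_yr = mdot * YEAR_S / M_SUN
print(f"mdot ~ {mdot:.1e} kg/s ~ {mdot_msun_yr:.2f} solar masses per year")
```

So a fraction of a solar mass per year is enough, which is modest compared with the gas content of a whole galaxy.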

The Eddington limit. The radiation flowing outwards from a luminous object can interact with
surrounding material and cause an outward radiation pressure. First, recall that a photon, as
well as having energy $E = h\nu = hc/\lambda$, has momentum $p = h/\lambda = E/c$. Now, the radiation flux S
hitting some material is the amount of energy per unit time per unit area; so this corresponds
to a momentum flux S/c. (We are using S instead of F here so we don’t get flux confused with
force.) Most of that flux may pass straight through, but some of it will scatter
on the electrons inside atoms. That scattering produces a force on the electrons, which drag the
atoms with them. The scattering process has a cross-section $\sigma_e = 6.65 \times 10^{-29}$ m$^2$. The rate of
momentum transfer per atom is therefore

$$\frac{dp}{dt} = \frac{S\sigma_e}{c}$$

However, Newton's law F = ma can be stated as F = dp/dt. So the radiation flux S causes a
radiation force on each atom of $F_{rad} = S\sigma_e/c$. At a distance R from an object of luminosity L the
flux is $S = L/4\pi R^2$, so we will have

$$F_{rad} = \frac{L\sigma_e}{4\pi R^2 c}$$

In astrophysical gas, nearly all the material is Hydrogen, and nearly all the mass of each H atom
is in the proton, so to a reasonable approximation the mass of each atom is $m_p$. This mass also
feels a gravitational force due to the central object of mass M

$$F_{grav} = \frac{GMm_p}{R^2}$$

For a given central mass M , as you increase L, there comes a point where the outward radiation
force will exceed the inward gravity force, and no accretion is possible; the material would instead
be expelled. We can find this limiting luminosity by setting Frad = Fgrav , which gives us the
maximum luminosity:

$$L_{max} = \frac{4\pi G m_p c}{\sigma_e} \cdot M \tag{8.4}$$

This maximum luminosity is known as the Eddington limit. In SI units, if we put the constants
in, we get $L_{max} \approx 6.32\,M$, with L in Watts and M in kg. This may seem modest - a 1 kg black hole
would give us 6.32 Watts of power, but those AGN black holes are very big... For $10^8\,M_\odot$ we get
$1.3 \times 10^{39}$ W. The luminosity we get in practice is very much in the right ball park to explain AGN.
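
Equation (8.4) can be evaluated directly. The sketch below uses rounded constants together with the cross-section quoted in the text; the function name is our own. It gives a coefficient a little over 6 W per kg and reproduces the figure quoted for a $10^8\,M_\odot$ black hole.

```python
import math

G = 6.674e-11       # gravitational constant in SI units (rounded)
M_P = 1.673e-27     # proton mass in kg (rounded)
C = 2.998e8         # speed of light in m/s (rounded)
SIGMA_E = 6.65e-29  # electron scattering cross-section in m^2, from the text
M_SUN = 1.989e30    # solar mass in kg (rounded)


def eddington_luminosity_w(mass_kg: float) -> float:
    """Maximum luminosity L_max = 4 pi G m_p c M / sigma_e, in watts."""
    return 4.0 * math.pi * G * M_P * C * mass_kg / SIGMA_E


coefficient = eddington_luminosity_w(1.0)  # watts per kilogram of central mass
l_edd_1e8 = eddington_luminosity_w(1e8 * M_SUN)
print(f"L_max per kg: {coefficient:.2f} W")
print(f"L_max for 1e8 solar masses: {l_edd_1e8:.2e} W")
```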

8.5.3. Application to AGN

We have arrived at a fairly consistent broad picture.

• Almost every galaxy contains a black hole with mass somewhere between a million and
a billion times the mass of the Sun.
• Material near the black hole settles into a disc, and friction allows it to slowly accrete and
heat up.
• The two free parameters in determining the luminosity are black hole mass $M_H$ and
accretion rate ṁ.
• The luminosity generated by accretion onto the black hole is $L = \mu \dot{m} c^2$.
• However for a given black hole mass $M_H$ there is a maximum luminosity, proportional
to $M_H$.
• Given the luminosities we get and the size of a black hole, the bulk of the radiation is in
the UV.

The BLR is located much further out, at around $1000 \times R_{EH}$. The jets observed in some AGN
are probably launched close to the black hole, but this is still a controversial topic.
Exercise 82. For a quasar with total energy output $L = 10^{39}$ W, what is the minimum black hole
mass required? If we have just that minimum mass, radiating at precisely the Eddington limit,
calculate the accretion rate in solar masses per year.

Exercise 83. What is the orbital velocity at radius R around mass M? Show that the escape
velocity from the same position is $\sqrt{2}$ times larger than the orbital velocity. Calculate the escape
velocity from the last stable orbit around a black hole.

9: THE GEOGRAPHY AND HISTORY OF GALAXIES

9.1. Outline of content


• Galaxy number counts
• The galaxy luminosity function
• Galaxy clustering
• Galaxy formation

In the lectures, we will take a cosmographic tour. It is useful and interesting to get some context
on where we live. But here in the notes, we will concentrate on taking a brief look at four topics
that will blend into the Cosmology part of the course.

9.2. Galaxy number counts

In section 4.2.2 we derived an equation for how the number of stars N we can see varies with the
faintest flux F we can reach, on the assumption that the stars are uniformly distributed through
space. Exactly the same equation applies to galaxies. If galaxies are distributed uniformly
through space, then the number we see brighter than some flux F should follow

$$N(>F) \propto F^{-3/2}$$
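
The 3/2 law is easy to check with a toy simulation. The sketch below is our own illustration, not a survey analysis: it scatters identical sources uniformly through a sphere of unit radius, computes their fluxes from the inverse square law, and measures the slope of N(>F) between two flux limits that lie well above the flux of a source at the edge. The seed, sample size and flux limits are arbitrary choices.

```python
import math
import random


def count_brighter(fluxes, limit):
    """Number of sources with flux above the given limit."""
    return sum(1 for f in fluxes if f > limit)


random.seed(1)
n_sources = 200_000

# Uniform density in a unit sphere: the radius is distributed as u^(1/3).
# Using 1 - random() keeps u in (0, 1], so no source sits exactly at r = 0.
radii = [(1.0 - random.random()) ** (1.0 / 3.0) for _ in range(n_sources)]
fluxes = [1.0 / (4.0 * math.pi * r * r) for r in radii]  # unit luminosity

f_edge = 1.0 / (4.0 * math.pi)  # flux of a source at the edge of the sphere
f1, f2 = 4.0 * f_edge, 16.0 * f_edge
n1, n2 = count_brighter(fluxes, f1), count_brighter(fluxes, f2)

slope = math.log(n2 / n1) / math.log(f2 / f1)
print(f"N(>F1) = {n1}, N(>F2) = {n2}, slope = {slope:.2f}")
```

Pushing the flux limit below `f_edge` adds no further sources, which is the toy version of the flattening seen when a survey reaches the edge of a finite system.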

Is this what we see? Fig. 56 shows the data. As with stars, at bright fluxes the 3/2 law seems
to be obeyed, but at faint fluxes the slope flattens. Does this mean that we are seeing the edge
of the universe of galaxies, as we saw the edge of the Milky Way system of stars? No, because
there is an extra complication, but a very interesting one - light travel time. [29] Our near neighbour
M31 is at a distance of 778 kpc. Light takes 2.5 million years to get here from there - a long time,
but short compared to the age of the Milky Way, which we found is around 8 Gyr. However, the
faint galaxies in Fig. 56 are at distances of the order of 1 Gpc, corresponding to a light travel
time of 3.3 Gyr. This is comparable to the age of the Milky Way. But how old is the Universe as
a whole? The Cosmology section will address this issue, but the answer turns out to be about
13.8 Gyr. The properties of the universe as a whole must have changed over this time - in the
past, the number and the luminosity of distant galaxies may have been very different. Turning
this round, just as star counts have been used to deduce the structure of the Milky Way, galaxy
counts can be used to test the evolutionary history of the Universe.

[29] Of course, light travel time effects do also apply to stars, but the time lag is very small
compared to the age of the Universe, so is not important in this context.
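
The light travel times quoted in this section come from a simple unit conversion, since one parsec is about 3.26 light years. A minimal sketch, with our own function name, which (like the text) ignores the expansion of the Universe during the light's journey:

```python
LY_PER_PC = 3.2616  # light years in one parsec (rounded)


def light_travel_time_yr(distance_pc: float) -> float:
    """Light travel time in years for a source at the given distance in parsecs."""
    return distance_pc * LY_PER_PC


t_m31 = light_travel_time_yr(778e3)  # M31 at 778 kpc
t_faint = light_travel_time_yr(1e9)  # faint galaxies at about 1 Gpc
print(f"M31: {t_m31 / 1e6:.1f} Myr")
print(f"1 Gpc: {t_faint / 1e9:.1f} Gyr")
```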

Figure 56. Number counts versus flux for galaxies, scaled to the flux for the Andromeda Nebula,
M31. The dataset is a compilation from three different surveys, and has been taken from the Durham
cosmology group web pages.

9.3. The galaxy luminosity function

As with stars, as well as measuring the observed number of galaxies versus flux, we would
like to know how common galaxies of differing luminosities are - in other words their luminosity
function φ(L) - the number of galaxies per unit volume of space at a given luminosity L. In section
6.2, we noted that counting stars of different luminosity down to a given flux gives a completely
different answer from counting them within a given volume. This is because luminous stars can
be seen much further away, so they are over-represented in a flux limited survey. Exactly the
same logic applies to galaxies. If the true luminosity function of galaxies is φ(L), i.e. the number
of galaxies per unit luminosity per unit volume, then the observed luminosity distribution N (L),
i.e. the number of galaxies per unit luminosity observed in a survey to flux limit F , is given by:

$$N(L) = \phi(L) \times V_{max}(L) \quad\text{where}\quad V_{max}(L) = \frac{1}{6\pi^{1/2}} \left(\frac{L}{F}\right)^{3/2} \tag{9.1}$$

Just as with stars, actually conducting a volume-limited survey is very difficult, so in practice we
measure the luminosity function in five steps. (i) We make a flux limited survey to some flux F.
(ii) For each galaxy in the list we estimate the distance and so the luminosity. (iii) In small ranges
dL we can count the number we have and so get the observed N(L)dL. (iv) In each of these
“luminosity bins” we can calculate $V_{max}(L)$. (v) We calculate $\phi(L)dL = N(L)dL/V_{max}(L)$.
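
The counting steps above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: the luminosities are taken as already estimated, the catalogue values are made up, units are arbitrary, and $V_{max}$ is evaluated at the geometric centre of each bin. The function names are our own.

```python
import math


def v_max(luminosity: float, flux_limit: float) -> float:
    """Volume out to which a source of this luminosity exceeds the flux limit (eqn 9.1)."""
    return (luminosity / flux_limit) ** 1.5 / (6.0 * math.sqrt(math.pi))


def luminosity_function(luminosities, flux_limit, bin_edges):
    """Return phi(L)dL for each bin: the count in the bin divided by Vmax at the bin centre."""
    phi = []
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        count = sum(1 for lum in luminosities if lo <= lum < hi)
        centre = math.sqrt(lo * hi)  # geometric mid-point of the bin
        phi.append(count / v_max(centre, flux_limit))
    return phi


# Hypothetical flux-limited catalogue: two faint and three luminous galaxies.
catalogue = [1.0, 1.2, 9.0, 11.0, 10.5]
phi = luminosity_function(catalogue, flux_limit=1.0, bin_edges=[0.5, 2.0, 20.0])
print(phi)
```

Even though the luminous galaxies outnumber the faint ones in this toy catalogue, dividing by $V_{max}$ shows that the faint ones are more common per unit volume.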

Figure 57. Galaxy luminosity function, with luminosities expressed as multiples of the
luminosity of the Sun. The data are taken from Norberg et al 2002
(http://adsabs.harvard.edu/abs/2002MNRAS.336..907N). The curve shown is the Schechter function,
as described in the text, with the parameters as fitted by Norberg et al.

A typical result is shown in Fig. 57. Just as with stars, there are actually many more small
galaxies than big ones. However, unlike stars, there is a fairly clear upper-size cut-off. An
empirical function which produces a good fit to the observations is known as the Schechter
function:

$$\phi(L) \propto (L/L_*)^{-\alpha}\, e^{-L/L_*} \tag{9.2}$$

In other words, it's a power-law function with an exponential cut-off. The power-law index is found
to be α ∼ 1.2 and the characteristic luminosity is $L_* \sim 10^{38}$ W - quite similar to the Milky Way.
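
To see the shape of the Schechter function, we can evaluate it at a few luminosities. The sketch below leaves out the overall normalisation, since equation (9.2) is only a proportionality, and the function name is our own.

```python
import math


def schechter(l_over_lstar: float, alpha: float = 1.2) -> float:
    """Unnormalised Schechter function, (L/L*)^(-alpha) * exp(-L/L*)."""
    return l_over_lstar ** (-alpha) * math.exp(-l_over_lstar)


for x in (0.01, 0.1, 1.0, 10.0):
    print(f"L/L* = {x:5.2f}: phi proportional to {schechter(x):.3e}")
```

Below $L_*$ the values follow the power law closely, while above $L_*$ the exponential factor takes over and the numbers fall away very quickly.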
Exercise 84. Does the Schechter function make physical sense? Does it suggest that the
number of galaxies is convergent or divergent towards (a) high luminosity, (b) low luminosity? Assume
that the mass of a galaxy is proportional to its luminosity (a fairly good assumption). Integrated
over the luminosity function, does the total stellar mass density in the universe seem to be finite
or infinite?

9.4. Galaxy clustering

The distribution of galaxies is semi-random. There is no grand structure, but it's not just uniformly
random - the distribution of galaxies is clumped, somewhat akin to the distribution of people on
the Earth. Just as people are clumped into villages, towns, countries etc, galaxies are clumped
into groups and clusters. We can study concentrations as distinct objects (analogous to cities)
or we can take a statistical approach to the general pattern of clustering. We will look briefly at
each of these.

9.4.1. Rich cluster masses

Figure 58. Left: the cluster of galaxies Abell 1689, with X-ray emission shown in purple
superimposed on a visible light image showing the galaxies. (Credit: H.Peng et al,
NASA/CXC/MIT/E.-H). Right: the distribution of radial velocities seen in the Coma cluster. The
term “Vp ” stands for “peculiar velocity” which means the velocity with respect to the overall velocity
of the cluster, which is moving away from us with the overall expansion of the Universe. The various
curves are theoretical models of the velocity distribution. (From Merritt 1987)

Here and there, hundreds to thousands of galaxies are clumped together into rich clusters. An
example is shown in Fig. 58. These are beautiful and interesting objects, but they also hold
important lessons.

The galaxies in a cluster are rather like the stars in a globular cluster or an elliptical galaxy -
they are moving around at random under the influence of the gravitational potential of the mass
of the cluster as a whole. As usual, we can measure the radial component of velocity using
spectral features plus the Doppler effect. The right hand panel of Fig. 58 shows the distribution
of observed velocities in the Coma cluster. This is centred on zero (galaxies are moving in all
directions) but with a random spread. Given the spread of velocities, what should we take as the
“typical” velocity? There are various ways to do this. One good way is as follows. You start at
V = 0 where the number of galaxies is a maximum. You then move out to larger velocities until
the number of galaxies with that velocity is about half the maximum. In the case of the Coma
cluster shown in Fig. 58 that suggests Vtyp ∼ 1000 km/s.

However, note that the Doppler measurements give us the line of sight velocity, which will
typically be smaller than the true velocity in 3D space. Suppose we define x, y co-ordinates
in the plane of the sky, and a z co-ordinate along the line of sight. If a particular galaxy has
velocity components Vx , Vy , Vz , then by the usual Pythagorean geometry, the total velocity must
be given by:

V^2 = V_x^2 + V_y^2 + V_z^2

Because the galaxies are moving in random directions, the relative size of V_z will vary, but on
average, we should have V_true^2 = 3 V_typ^2. That is the value we should use in our usual V^2 = GM/R
formula. What value do we use for R? There is no hard edge to the cluster - the number density
gradually fades out with radius. A good solution is to find the “effective radius” R_eff, defined as
the radius that contains half of the galaxies.[30] Finally, solving for the mass of the whole cluster,
M_clus, we find:

(9.3)   M_{clus} = \frac{3 R_{eff} V_{typ}^2}{G}

Taking the Coma cluster example, we have R_eff ∼ 1.5 Mpc and V_typ ∼ 1000 km s^-1, which gives
the mass of the cluster as M_clus = 1.0 × 10^15 M_⊙. Now the Coma cluster has around a thousand
galaxies in it, with on average around 10^10 solar masses each, giving a stellar mass of around
10^13 M_⊙. Of course, each of those likely has a dark matter halo; the total halo mass is probably
10^14 M_⊙. So observing clusters has revealed even more dark matter between the galaxies.
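
If you want to check the Coma numbers for yourself, the short Python sketch below evaluates equation 9.3. It is only an order-of-magnitude check: the solar mass value and the function and variable names are our own assumptions, and the parsec conversion is the one quoted later in the Cosmology part of the notes.

```python
# Order-of-magnitude check of equation (9.3): M_clus = 3 * R_eff * V_typ**2 / G

G = 6.67e-11          # gravitational constant, m^3 kg^-1 s^-2
PARSEC = 3.0856e16    # metres per parsec
M_SUN = 1.989e30      # solar mass in kg (assumed value)


def cluster_mass_kg(r_eff_m, v_typ_ms):
    """Cluster mass from equation (9.3), using the line-of-sight velocity V_typ."""
    return 3.0 * r_eff_m * v_typ_ms**2 / G


r_eff = 1.5e6 * PARSEC   # 1.5 Mpc expressed in metres
v_typ = 1.0e6            # 1000 km/s expressed in m/s

coma_mass_msun = cluster_mass_kg(r_eff, v_typ) / M_SUN
print(f"M_clus ~ {coma_mass_msun:.1e} solar masses")
```

The result should come out close to the 10^15 solar masses quoted above; the factor of 3 is what converts the line-of-sight velocity into a 3D velocity.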
Exercise 85. If the Coma results are typical, what percentage of a typical galaxy is dark matter,
and what percentage of a cluster is dark matter?

9.4.2. X-rays from rich clusters

X-ray observations show smooth emission filling the space between the galaxies. (This is also
shown in Fig. 58.) To make this X-ray emission requires very hot gas - roughly T ∼ 10^8 K.
This is not blackbody radiation, because the gas is thin and transparent. The mechanism is
“thermal bremsstrahlung” or “free-free radiation”. The details need not concern us here, but we
can just note that any kind of thermal radiation peaks at roughly the same frequency/wavelength
as blackbody radiation would - at the frequency where hν = kT. Why do we have such hot gas?

For a gas at 100 million degrees, how fast are the particles moving? The typical particle thermal
energy is E_th = 3kT/2, and we can set this equal to the kinetic energy K = m_p v^2/2, where as
usual we have used the mass of the proton as the typical particle mass.
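
As a rough illustration, the sketch below solves m_p v^2/2 = 3kT/2 for v at the round temperature of 10^8 K mentioned above. The constants are standard values rounded to three figures and the function name is our own; treat it as a sketch rather than a precise calculation.

```python
import math

K_B = 1.38e-23   # Boltzmann constant, J/K
M_P = 1.67e-27   # proton mass, kg


def thermal_speed(temperature_k, particle_mass_kg=M_P):
    """Speed v that satisfies (1/2) m v^2 = (3/2) k T."""
    return math.sqrt(3.0 * K_B * temperature_k / particle_mass_kg)


v = thermal_speed(1.0e8)   # gas at 100 million kelvin
print(f"typical proton speed ~ {v / 1e3:.0f} km/s")
```

The answer is somewhat over a thousand km/s. Because v scales as the square root of T, the same function can be reused for other temperatures.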
Exercise 86. From X-ray observations of the Coma cluster it is found that the gas temperature
is T = 8 × 10^7 K. What is the typical particle velocity? How does this compare with the typical
galaxy velocity in section 9.4.1 above?

It may seem like a cosmic coincidence that the particle velocities and the galaxy velocities are
similar, but in fact it’s not a coincidence at all. In both cases, the galaxy or the particle can be
seen as a small mass m responding to the gravity of a large mass M_clus, and we just get the
good old V^2 = GM/R. There are subtleties of course. For example, at this temperature, the
gas will be completely ionised, so the protons and the electrons can be seen as two distinct but
interacting gases. The protons are much more massive; they gain essentially all the potential
energy. However in collisions with the electrons they share that energy. It’s the electrons that we
actually see radiating.

[30] Of course, it is possible to do all this more laboriously but accurately by cleverly integrating over the radial density
and velocity distribution.

Turning this result round, we can use the X-ray temperature to estimate Mclus ; it gives us the
same answer as the galaxy motions. But you may ask, perhaps the X-ray gas is the dark
matter? Unfortunately not. We can estimate the amount of material from the strength of the
X-ray emission; it is about equal to the mass in stars in the galaxies. So that is a lot, but it still
leaves us with 98% of the mass missing.

9.5. Galaxy formation (non-examinable)

We have covered a lot of material already, so this final section in the Galaxies part of the course
is non-examinable - but do read it, because it will help put things in context, and get you ready
for the Cosmology part of the course.

How did galaxies form, and why did they end up with the properties that we see - especially (i)
why are there two types of galaxy - elliptical and spiral? and (ii) why is there a characteristic
upper galaxy size, as we see in the luminosity function? These questions are not really solved -
they are active research topics. However, we can see the key physical ingredients.

Heating and cooling matters. You might think that you can start with a gas cloud and just let
it collapse under its own gravity. Even for a big galaxy, this would take only a few million years.
However, as we saw with rich clusters, if gas collapses into a dark matter halo it will get hot - not
quite as hot as in clusters, but still millions of degrees. It cannot finish collapsing until it cools. The
cooling timescale is very sensitive to the density. A cloud that starts at lower density will take
longer to cool.
Exercise 87. Can you work out why density is so important for cooling time? The process
by which the hot gas radiates is free-free radiation. Without going into details, this involves
Coulomb force interactions between electrons and protons. Each time an electron passes close
to a proton, the Coulomb force deflects it, and while on a curved path, it radiates. How do you
think the radiated power per unit volume of gas will depend on density? Will it go as ρ^(1/2), ρ, ρ^2 or ρ^3?

Angular momentum matters. The primordial cloud will always have some small degree of
angular momentum. If the material starts at initial radius R_i and rotates with some velocity V_i,
each unit mass has angular momentum L = R_i V_i. As the cloud collapses L is conserved but R
decreases so V increases, until it reaches the point where V is equal to the Keplerian velocity
V = (GM/R)^{1/2}. The final radius will be given by

R_{min} = \frac{L^2}{GM}

So perhaps another differentiator for the type of galaxy formed is the spin of the primordial gas -
high spin gas tends to make a spiral galaxy.
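
To get a feel for the numbers, here is a minimal Python sketch of the R_min formula above. The cloud mass, starting radius and starting velocity are purely hypothetical values chosen for illustration, M is treated as a single fixed enclosed mass, and the solar mass value is an assumption of ours.

```python
import math

G = 6.67e-11          # gravitational constant, m^3 kg^-1 s^-2
PARSEC = 3.0856e16    # metres per parsec
M_SUN = 1.989e30      # solar mass in kg (assumed value)


def final_radius(r_initial_m, v_initial_ms, mass_kg):
    """R_min = L^2 / (G M), with angular momentum per unit mass L = R_i * V_i."""
    specific_l = r_initial_m * v_initial_ms
    return specific_l**2 / (G * mass_kg)


# Hypothetical cloud: 10^11 solar masses, starting at 100 kpc, rotating at 20 km/s.
mass = 1.0e11 * M_SUN
r_i = 100.0e3 * PARSEC
v_i = 20.0e3

r_min = final_radius(r_i, v_i, mass)
v_final = r_i * v_i / r_min   # conservation of L gives the final rotation speed

print(f"R_min   ~ {r_min / PARSEC / 1e3:.1f} kpc")
print(f"V_final ~ {v_final / 1e3:.0f} km/s")
```

With these made-up inputs the cloud settles at roughly a tenth of its starting radius, and the final speed equals the Keplerian value (GM/R_min)^{1/2} by construction.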

Exercise 88. Suppose we hypothesise that the material that formed the Sun originally started out
at a Galactic radius ten times larger than its current radius, when a gas cloud collapsed to form
the Milky Way. From what you know about the Sun’s motion, what was the rotation velocity of
that primordial material?

It matters when the stars form. For gas, although cooling takes a long time, it does happen
eventually. Once stars form however, there is no dissipation. We have seen that galaxies are
collisionless; once stars have formed, the motions are frozen in. So early star formation may
tend to form spheroids, with stars buzzing around at random, whereas delayed star formation
allows the gas to cool and collapse to a rotating disc.

Mergers matter. Galaxies do not necessarily form monolithically from a single cloud, but by the
gradual merger of sub-units. Because the universe was denser in the past (see the Cosmology
notes...) collisions and mergers were more common earlier in the history of the Universe. Note
that if two disc galaxies collide, the result will not be a nice neat disc, but randomised star orbits.
One theory suggests that this also leads to a burst of star formation which then blows gas out of
the galaxy, leaving an elliptical galaxy behind.

The dark matter matters. While the gas is hot, it finds it hard to collapse. But the dark matter,
like stars, is thought to be collisionless, and so can start collapsing sooner. The dark matter
halos therefore form first, and the gas later falls into pre-existing potentials.

The quasars matter. Recall from the previous section that it seems likely that all galaxies contain
a black hole, but that the stars have around 10^3 times as much mass as the black hole. However,
the black holes have generated energy with an efficiency around 10^3 times larger than the stars.
Very roughly then, the energy generated by stars and by accretion has been about the same.
This seems unlikely to be a coincidence. The suspicion is that a black hole keeps growing until it
becomes so luminous that its energy output can somehow stop gas infall and new star formation
in the parent galaxy. This is known as feedback.

Galaxy formation is still very much an active research topic, but these are the key issues that
people are arguing about today!
Cosmology

10: The Expanding Universe

10.1. Outline of content


• Olbers’ Paradox
• Hubble’s Law
• The Horizon
• The Copernican and Cosmological Principle

10.2. Olbers’ Paradox: why is the night sky dark?

Our first observation of the Universe is darkness. On a clear evening in Edinburgh you’ll see
many stars in our galaxy and with a pair of decent binoculars, you’ll be able to see our nearest
neighbour galaxy Andromeda, but generally you see dark skies. Let’s use this basic observation
you make every day to say something fundamental about the Universe!

It was a simple question for Olbers to ask in 1823 (although he wasn’t the first; Kepler and
colleagues were thinking about this back in 1600): ‘If our Universe is infinite, containing an infinite
number of stars, why is the sky at night dark?’

Let’s use some simple physics to answer this question:


Exercise 89. First we construct a toy model of the Universe where every star has a radius
R = 10^9 m and the stars are distributed evenly throughout the cosmos with a stellar density
n = 10^-60 m^-3. (This is about right on average.) With this toy model how far out in the Universe
would we need to look until we saw a star in every direction? If you’re unsure where to start:

• The cross-sectional surface area of a star is A = πR^2.

• Every line of sight will end on a star when A × D = 1/n.

You should calculate a distance of D ∼ 10^41 m (pretty big).


Exercise 90. How long will it take for the light from all these stars to reach us (light travels at a
finite speed)?

You should calculate a time of t ∼ 10^33 s ∼ 10^25 years for the light from the furthest star to reach
us (quite a long time).
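
The two toy-model estimates can be checked with a few lines of Python. This is just a sketch of the arithmetic in Exercises 89 and 90; the speed of light and the number of seconds in a year are rounded values that we have assumed.

```python
import math

C = 3.0e8                    # speed of light, m/s (rounded)
SECONDS_PER_YEAR = 3.156e7   # assumed conversion factor

star_radius = 1.0e9          # R, in metres
number_density = 1.0e-60     # n, in stars per cubic metre

# Every line of sight ends on a star when A * D = 1/n.
cross_section = math.pi * star_radius**2
distance = 1.0 / (number_density * cross_section)

# Light travel time from that distance.
travel_time_s = distance / C
travel_time_yr = travel_time_s / SECONDS_PER_YEAR

print(f"D ~ {distance:.1e} m")
print(f"t ~ {travel_time_s:.1e} s ~ {travel_time_yr:.1e} years")
```

Both numbers land within a factor of a few of the round values quoted in the text, which is all we can ask of a toy model.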

Important Conclusions: The night sky is dark, from which we can conclude, within our toy model,
that the Universe is either

• smaller than D ∼ 10^41 m and therefore not infinite, or

• younger than t ∼ 10^25 years.

From the most basic of observations we’ve made some pretty strong conclusions about
the Universe.

As a scientist you should now be questioning how good our toy model was. We know stars
actually group together in galaxies instead of being scattered randomly through the cosmos. We
also know that our Universe is expanding and that stars emit at all different wavelengths and
that more distant stars get red-shifted which changes the wavelength of their light (more on that
later). Could these change our conclusions significantly? Discuss!

10.3. Hubble’s Law: everything is moving away from us

We’ve made some fairly strong conclusions about the size or age of our Universe based on our
nightly observation of darkness. In order to find out which of our conclusions were right we turn
to a second observation, this time made by Edwin Hubble and collaborators back in the 1920s.

Recall from Galaxies section 7.4.1, that if galaxies are moving away from (or towards) us their
light experiences a Doppler shift towards the red (blue) end of the spectrum. Recall from
Stars section 1.3 that stars emit a spectrum of light with characteristic absorption and emission
features at fixed wavelengths which correspond to changes in electron orbits within the Hydrogen
atom. As galaxies are just collections of stars, their spectra resemble stellar spectra, and
hence we can look for Doppler shifts in the same absorption and emission features that we see
in nearby stars.

Figure 59 shows a spectrum of a typical nearby galaxy taken by the Sloan Digital Sky Survey
(SDSS). You can see many of the features in the spectrum have been identified as different
chemical transitions. Let’s focus on the Hydrogen lines, or the ‘Balmer Series’, which refers to
transitions to and from the n = 2 orbit shell.
Exercise 91. Using the Rydberg formula 1.12 (see Stars section 1.3), verify the ‘rest-frame’
wavelengths (i.e. the wavelengths that we would measure in the lab, and the wavelengths at which
the light was emitted) of the Balmer Series Hydrogen lines in the table below.

Name   Transition   λ_e (Å)
Hα     3→2          6562.8
Hβ     4→2          4861.3
Hγ     5→2          4340.5
Hδ     6→2          4101.7
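
If you would like to check your working for Exercise 91, the sketch below evaluates the Rydberg formula in its standard form, 1/λ = R_H (1/2^2 − 1/n^2). We are assuming that this is the form given as formula 1.12, and the Rydberg constant value is our own input. The formula gives vacuum wavelengths, whereas the tabulated values are the usual wavelengths measured in air, so expect agreement only to about 0.03%.

```python
RYDBERG_H = 1.09678e7   # Rydberg constant for hydrogen, m^-1 (assumed value)

# Upper level n -> tabulated rest wavelength in Angstroms (from the table above)
TABLE = {3: 6562.8, 4: 4861.3, 5: 4340.5, 6: 4101.7}


def balmer_wavelength_angstrom(n_upper):
    """Vacuum wavelength of the n_upper -> 2 transition, in Angstroms."""
    inverse_wavelength = RYDBERG_H * (1.0 / 2**2 - 1.0 / n_upper**2)  # m^-1
    return 1.0e10 / inverse_wavelength


for n_upper, tabulated in TABLE.items():
    computed = balmer_wavelength_angstrom(n_upper)
    print(f"{n_upper} -> 2: computed {computed:.1f}, tabulated {tabulated:.1f}")
```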

Exercise 92. Using the data table above, calculate the Doppler Shift and hence the recession
velocity of the galaxy whose spectrum is shown in Figure 59.

Figure 59. An example galaxy spectrum from the Sloan Digital Sky Survey: Sky Server. Spectral
lines have been identified, in particular the Balmer lines, shown with a black box and arrow.

Hint: Compare the observed wavelengths of the Balmer line spectral features to the rest-frame
emitted wavelengths and use

(10.1)   \frac{\lambda_o - \lambda_e}{\lambda_e} = \frac{v_r}{c}
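
Equation 10.1 is easy to turn into a small function. In the sketch below the observed wavelength is a hypothetical reading invented for illustration; it is not measured from Figure 59, so you will still need to read the real value off the spectrum yourself.

```python
C_KM_S = 2.998e5        # speed of light, km/s
H_ALPHA_REST = 6562.8   # rest-frame H-alpha wavelength, Angstroms (from the table)


def recession_velocity(lambda_obs, lambda_emit):
    """Radial velocity in km/s from equation (10.1): v_r = c (lambda_o - lambda_e) / lambda_e."""
    return C_KM_S * (lambda_obs - lambda_emit) / lambda_emit


# Hypothetical observed H-alpha wavelength, NOT taken from Figure 59.
observed_h_alpha = 6700.0
v_r = recession_velocity(observed_h_alpha, H_ALPHA_REST)
print(f"v_r ~ {v_r:.0f} km/s")
```

A positive result means the line has moved to longer wavelengths, i.e. the galaxy is receding.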

Hubble performed this exact measurement, looking at the spectra of around 25 galaxies, finding
that each galaxy was traveling away from us at speed v_r. When he compared these recession
velocities with the distances that he had measured to the same galaxies using Cepheids (see
Galaxies section 4.4), he found that the recession velocity was directly proportional to the distance
D of that galaxy from Earth:

(10.2)   v_r = H_0 D

where the proportionality constant H_0 was named after Edwin and is known as Hubble’s constant.
The best measurements today find H_0 = 72 km s^-1 Mpc^-1.

Let’s use this law and some simple physics to estimate the age of the Universe:
Exercise 93. First let’s construct a toy model of a galaxy which at the start of the Universe is at
the same point in space as our galaxy. Today this galaxy is separated from us by a distance D
and it has taken the time τ_H to get there (τ_H is the current age of the Universe, known as the Hubble
time). If we assume that Hubble’s constant H_0 is constant at all times we can use Hubble’s Law to
estimate the age of the Universe τ_H. If you’re unsure where to start remember your high-school
physics: speed = distance travelled / time taken.

For H_0 = 72 km s^-1 Mpc^-1, you should calculate τ_H = 1/H_0 = 13.6 Gyr. The tricky part is
converting the units from km s^-1 Mpc^-1 to s^-1 (use 1 pc = 3.0856 × 10^16 m) and then seconds to
years.
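
Since the unit conversion is where most people slip up, here is a minimal Python sketch of it. The number of seconds in a year is an assumed value and the function name is our own.

```python
PARSEC_M = 3.0856e16         # metres per parsec
SECONDS_PER_YEAR = 3.156e7   # assumed conversion factor


def hubble_time_gyr(h0_km_s_mpc):
    """Hubble time 1/H0 in Gyr, for H0 given in km/s/Mpc."""
    # km -> m in the numerator, Mpc -> m in the denominator, leaving s^-1
    h0_per_second = h0_km_s_mpc * 1.0e3 / (1.0e6 * PARSEC_M)
    return 1.0 / h0_per_second / SECONDS_PER_YEAR / 1.0e9


tau_h = hubble_time_gyr(72.0)
print(f"Hubble time ~ {tau_h:.1f} Gyr")
```

Note that a larger H_0 gives a younger Universe, since τ_H is simply its inverse.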

Side note for the budding astrophysicists in the audience: this calculation gives us a good enough
estimate of the age of the Universe for the purposes of the course but it is flawed as Hubble’s
constant changes with time. To do this calculation correctly you need to integrate taking into
account the geometry and changes in the expansion rate of the Universe.

10.4. Is the Universe finite or infinite? Does it have a centre and an edge?

We can use our findings from Olbers’ Paradox and Hubble’s Law to answer the question: is the
Universe finite or infinite? As the Universe is 13.6 Gyr old and therefore younger than the t ∼ 10^25
years we calculated for a bright night sky, we can conclude that the night sky is dark simply
because there has not been enough time for all of the light from an infinite
Universe to have reached us. So the Universe could be infinite, but the part of the Universe that
we can observe is certainly finite. We call the extent of the Universe that we can observe
the Horizon.
Exercise 94. Does the Universe have a centre? If it does we are in it; Hubble observed
everything moving away from us! We therefore need to ask ourselves two important philosophical
questions

1. Are we so special that we live at the centre of the Universe?


2. Are we in a special part of the Universe which is different from elsewhere in the Universe?

Come to your own conclusions and discuss.

Cosmology is built on the following two principles:

• Copernican Principle - we do not live at the centre of the Universe


• Cosmological Principle - the Universe is homogeneous and isotropic on large scales
(at any instant the Universe has the same properties everywhere).

Armed with these two principles we can conclude that from whichever part of the Universe Hubble
made his measurement, he would come to the same finding and conclude:

Hubble’s law: Each galaxy is traveling away from every other galaxy at speed V which is directly
proportional to the distance between the galaxies D.

For an infinite Universe, there is clearly no edge. For a finite Universe there is also no edge. In
a finite Universe if you set off in your space ship in one direction you would eventually return to
the spot where you left (imagine a 2D Universe confined to the surface of a giant sphere, like
walking around the surface of the Earth).

You might ask: if the Universe is expanding, what is it expanding into? The cosmologist’s answer
is that the Universe is everything, so there is no ‘what’; everything is expanding.
11: THE BIG BANG AND INFLATION

11.1. Outline of content


• The cosmic scale factor R(t)
• The critical density ρcrit
• The density parameter Ω
• Open, closed and flat Universes

11.2. The Big Bang

Hubble showed that the Universe was expanding. Reverse time and you quickly come to the
conclusion that at a finite time in the past everything in the Universe was arbitrarily close (see
exercise 93). It is a well known theory that our Universe started in a hot Big Bang which set off
the expansion that we see today. What was disliked about this theory when it first came about
was the initial ‘mathematical singularity’, a starting point for the Universe of infinite density and
temperature (and a ‘centre’) but we’ll come to the inflation theory solution for that problem later
on.

This expansion will ultimately be slowed by the fundamental force of gravity. To understand this
let’s consider a sphere in the Universe centred on us. A galaxy on the edge of this sphere will be
traveling with speed V away from us. We’ll use Birkhoff’s theorem, which states that this galaxy is affected
gravitationally only by the mass within the sphere (and we can therefore ignore everything else
in the Universe outside our sphere). Birkhoff’s theorem also says that the mass within a sphere
acts like a point mass at the centre.

The energy of the galaxy E is given by the sum of its kinetic and potential energies
(11.1)   E = \frac{m V^2}{2} - \frac{G M m}{r}
where m is the mass of the galaxy, r is the radius of our toy sphere and M is the mass of the
Universe enclosed. For E < 0, gravity wins and the sphere stops expanding and re-collapses in
a big crunch. For E > 0, the kinetic energy wins and the sphere expands for ever.

Exercise 95. In the case where E < 0 write down a formula for what size our toy sphere will be
at the peak of the expansion. Clue: this will be when V = 0; the maximal point before the big
crunch commences.

11.3. Critical Density

Let’s now focus on the ‘critical’ case where E = 0. In this case KE = PE and the Universe’s
expansion is perfectly balanced. Only as the size r → ∞ will the speed of expansion V → 0.
This critical case occurs when the density of the Universe is equal to the critical density ρ_crit.
Let’s put together all the information we have about this critical Universe:

Conservation of Energy:

(11.2)   \frac{V^2}{2} = \frac{G M}{r}

Mass within a sphere:

(11.3)   M = \frac{4 \pi r^3}{3} \rho_{crit}

Exercise 96. Take the two equations above and Hubble’s Law (equation 10.2) to show the critical
density of the Universe today:

(11.4)   \rho_{crit} = \frac{3 H_0^2}{8 \pi G}

Use H_0 = 72 km s^-1 Mpc^-1 and G = 6.67 × 10^-11 m^3 kg^-1 s^-2 to calculate ρ_crit = 9.7 × 10^-27 kg m^-3.

The critical density of the Universe today is roughly 6 protons per cubic metre (a proton has
mass m_p = 1.67 × 10^-27 kg).
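
The numerical part of Exercise 96 can be checked with the short sketch below, which evaluates equation 11.4 using the constants quoted in the text. The function and variable names are our own.

```python
import math

G = 6.67e-11          # gravitational constant, m^3 kg^-1 s^-2
PARSEC_M = 3.0856e16  # metres per parsec
M_PROTON = 1.67e-27   # proton mass, kg


def critical_density(h0_km_s_mpc):
    """Critical density 3 H0^2 / (8 pi G) in kg m^-3, for H0 in km/s/Mpc."""
    h0_per_second = h0_km_s_mpc * 1.0e3 / (1.0e6 * PARSEC_M)
    return 3.0 * h0_per_second**2 / (8.0 * math.pi * G)


rho_crit = critical_density(72.0)
protons_per_m3 = rho_crit / M_PROTON

print(f"rho_crit ~ {rho_crit:.1e} kg m^-3")
print(f"         ~ {protons_per_m3:.1f} protons per cubic metre")
```

Because ρ_crit scales as H_0 squared, a modest change in the measured Hubble constant changes the critical density noticeably.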

Something to note here is that after the Big Bang there is no way to create new mass in the
Universe, so even though our sphere is expanding, the mass within it remains constant. Looking
at equation 11.3 this means the critical density must decrease as r increases. Looking at
equation 11.4 this means Hubble’s constant must change with time.

11.4. The cosmic scale factor and redshift

We’re now going to introduce the concept of the cosmic scale factor R(t) which is the separation
between any two galaxies at time t, compared to their separation today.

(11.5)   R(t) = \frac{r(t)}{r(t = \mathrm{today})}

Exercise 97. What is R(t=today)? For an expanding Universe is R(t=past) greater than or less
than R(t=future)?

Here is a clue if you’re struggling: when R(t) = 2, all galaxies will be twice as far apart as they
are now, which in an expanding Universe must happen in the future.

We can use our conservation of energy argument to model the cosmic scale factor shown in
Figure 60:

Figure 60. The cosmic scale factor for three scenarios, with curves labelled E > 0, E = 0 and E < 0:
for E < 0 the sphere stops expanding and re-collapses, while for E = 0 and E > 0 the sphere expands for ever.

In Astronomy we often define the cosmic scale factor in terms of redshift, which is a tricky
concept, particularly because the literature is very confused on the subject. In an expanding
Universe, galaxies are all moving away from us and their light therefore experiences a Doppler
shift, shifting the spectrum to the red, hence redshift. You might however read that this is not the
case, that the photons from the more distant galaxies have travelled through more expanding
space and therefore their wavelengths are stretched (redshifted) the most. This
is incorrect because it suggests that there is some extra new force in the Universe which can
physically stretch things. An expanding Universe does mean globally expanding space, but just
because space is expanding does not mean that the things in it are expanding as well! Imagine
standing on an expanding sheet of frictionless ice: just because the ice is expanding doesn’t
mean your legs will move apart. The correct interpretation of redshift is to consider the light
from a distant galaxy experiencing a series of little Doppler shifts as it traverses the expanding
Universe towards our telescopes on planet Earth. The further away the galaxy is, the more little
Doppler shifts the emitted galaxy light experiences, and the further that light is shifted towards
the redder wavelengths.

Redshift is defined as

(11.6)   z = \frac{\lambda_o - \lambda_e}{\lambda_e}

where λ_e is the wavelength of the light emitted by the distant galaxy and λ_o is the wavelength of the
galaxy’s light that we observe from Earth.

Swapping distances r for wavelengths λ in equation 11.5,

(11.7)   R(t_{emit}) = \frac{\lambda_{emit}}{\lambda_{obs}}

and combining with equation 11.6 (with λ_emit = λ_e and λ_obs = λ_o) we find

(11.8)   R(t) = \frac{1}{1+z}
Exercise 98. Calculate the redshift of the galaxy shown in Figure 59 and the cosmic scale factor
at the time when the light from this galaxy was emitted.

Astronomers use redshift to indicate both age and distance in the Universe: the greater the
distance, the further back in time you peer. The most distant galaxy detected by astronomers to
date is at z ∼ 8. The light observed from this object was emitted only half a gigayear after the
Big Bang.
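
As a small illustration of equations 11.6 and 11.8, the sketch below converts between wavelengths, redshift and scale factor, and evaluates the scale factor for the z ∼ 8 galaxy mentioned above. The function names are our own.

```python
def redshift(lambda_obs, lambda_emit):
    """Equation (11.6): z = (lambda_o - lambda_e) / lambda_e."""
    return (lambda_obs - lambda_emit) / lambda_emit


def scale_factor(z):
    """Equation (11.8): R = 1 / (1 + z)."""
    return 1.0 / (1.0 + z)


r_at_z8 = scale_factor(8.0)
print(f"At z = 8 the scale factor was R ~ {r_at_z8:.2f}")
```

In other words, when the light from a z ∼ 8 galaxy was emitted, all galaxy separations were about one ninth of what they are today.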

11.5. Open, Closed and Flat Universes

Now we come on to a rather mind-bending concept: the geometry of the Universe. Let’s work
in 2D and think about the surface of the Earth which for centuries people were convinced was
flat. Sailors feared falling off the edge until one day someone sailed all the way round and ended
up where they started. The geometry of the surface of the Earth was found to be curved on a
sphere rather than flat like a sheet of paper.

Now think bigger and ask what is the geometry of the Universe? If we set off in a straight line will
we end up where we started? Figure 61 and its caption are taken from the fantastic NASA Big
Bang Concepts website. I couldn’t think of a better way of stating it so reproduced it verbatim!
Einstein’s theory of general relativity states that matter will bend space and time. Depending on
how much stuff there is in the Universe, there are then three different possible geometries of the
Universe; closed, open and flat.

We can define the density parameter


(11.9)   \Omega_0 = \frac{\rho}{\rho_{crit}}
which is related to the geometry as listed below

• Closed: Ω0 > 1, E < 0, the big crunch


• Open: Ω0 < 1, E > 0, expands for ever
• Flat: Ω0 = 1, E = 0, the critical case
Exercise 99. Make sure you understand how Ω is related to the Energy E in our toy model of
the expanding sphere and how we can use that understanding to work out the future of each
geometry in the Universe.
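
The three cases listed above can be summarised in a few lines of Python. This is only a sketch of the toy model: the tolerance used to decide when Ω_0 counts as exactly 1 is a hypothetical choice of ours, and the densities in the loop are made-up examples.

```python
def classify_universe(rho, rho_crit, tolerance=1e-3):
    """Return (omega, geometry, fate) for the toy expanding-sphere model.

    `tolerance` is a hypothetical threshold for treating omega as equal to 1.
    """
    omega = rho / rho_crit
    if abs(omega - 1.0) <= tolerance:
        return omega, "flat", "E = 0: the critical case"
    if omega > 1.0:
        return omega, "closed", "E < 0: the big crunch"
    return omega, "open", "E > 0: expands for ever"


RHO_CRIT = 9.7e-27  # kg m^-3, the value from section 11.3

# Made-up example densities, for illustration only.
for rho in (0.3 * RHO_CRIT, RHO_CRIT, 2.0 * RHO_CRIT):
    omega, geometry, fate = classify_universe(rho, RHO_CRIT)
    print(f"Omega_0 = {omega:.2f}: {geometry} ({fate})")
```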

In summary we’ve used some simple physics (conservation of energy, mass in a sphere and
Hubble’s Law) to determine three different models for the Universe. Each model has a different
fate and a different geometry. Which type of Universe we live in all depends on the density of
the Universe today so if we can measure this, then just with this very simple derivation we can
determine something as fundamental as the future of our Universe! You’ve got to love physics!!

Observers measure the total density parameter Ω ∼ 1; a flat Universe expanding on this
critical path. We’ll talk more about how we measure Ω later on in the course.

Figure 61. Curved Space (panels labelled Closed, Open and Flat). Figure and caption reproduced from NASA Big Bang Concepts:
“Given the assumption that the matter in the universe is homogeneous and isotropic (the
Cosmological Principle) it can be shown that the corresponding distortion of space-time (due to
the gravitational effects of matter) can only have one of three forms, as shown schematically in
the picture above. It can be ‘positively’ curved like the surface of a ball and finite in extent; it can
be ‘negatively’ curved like a saddle and infinite in extent; or it can be ‘flat’ and infinite in extent -
our ‘ordinary’ conception of space. A key limitation of the picture shown here is that we can only
portray the curvature of a 2-dimensional plane of an actual 3-dimensional space! Note that in a
closed universe you could start a journey off in one direction and, if allowed enough time, ultimately
return to your starting point; in an infinite universe, you would never return.”

11.6. Skeptical Sam on the Big Bang model

The flatness problem: What pre-determined the density of our Universe so precisely, to make
our Universe so flat? The density could have taken any value, but it is instead fine-tuned to be
very close to the critical density.

The horizon problem: When we look to the horizon in one direction it looks essentially the same
and has the same temperature as the horizon in the other direction. However this implies that
at some point in the past these two regions must have been connected (otherwise they wouldn’t
have the same properties). As our horizon is defined as the furthest light can travel since the
big bang, these two regions (in this case separated by 2 horizon scales) cannot have been
connected at the big bang!

11.7. Guth’s Inflation solution

In 1980 Alan Guth proposed a neat solution to the two problems that had plagued the Big Bang
theory since it was first proposed. He developed a theory called Inflation, whereby a new
form of energy from a field created roughly 10^-30 seconds after the big bang accelerated the
expansion of the Universe at speeds faster than light and then stopped. The Universe then
continued to expand at the slower Hubble rate. In the very hot early Universe lots of strange
physics is possible so this scenario is far from the realms of fiction.

Figure 62. In the very early Universe inflation causes the Universe to expand faster than the speed
of light. This initial event blows the Universe up to very large scales (left end of the diagram). The
Universe then continues to expand at the slower Hubble rate (left to right along the diagram). Figure
from U Chicago, Bryan Christie Design

Figure 62 schematically shows why this solves the horizon problem (the Universe was all connected
before inflation). Inflation also solves the flatness problem as the rapid acceleration would
essentially flatten out any irregularity in the geometry of the early post-big-bang Universe leading
to the flat Universe we see today. Inflation also solves the problem of why galaxies are distributed
unevenly throughout our Universe, and we’ll discuss this further later on.
12: THE COSMIC MICROWAVE BACKGROUND

12.1. Outline of content


• Matter-Radiation equality
• Recombination
• Black body spectrum
• Hot spots and cold spots in the CMB

12.2. Matter and Radiation

Photons in our everyday life help us to see, they keep the Earth warm and at worst they
inflict sunburn injuries on a hot summer’s day. Photons are indeed very useful, but they are
not abundant enough today to, say, kick us off our seats. In the very early hot Universe however
photons played a very different role, and for a short time they dominated the Universe, pushing
matter around!

Let’s first look at the energy of matter and radiation:

• Matter Energy: E = mc^2


• Photon Energy: E = hc/λ ∝ 1/R(t)

You should be familiar with Einstein’s famous result that relates energy E and mass m (c is the
speed of light). The energy of a photon is inversely proportional to its wavelength λ (h is Planck’s
constant). Remember that the recession of the galaxy away from us increases the wavelength
of the photon that we observe. An increase in wavelength decreases the energy of the photon.
The energy of the photon is therefore inversely proportional to the cosmic scale factor R(t).
Exercise 100. Write down a relationship between the energy density of matter and radiation
and the cosmic scale factor R(t).

Hint: the energy density of bananas in the Universe ρ_b = Number of bananas × energy of a
banana / Volume of the Universe.

The energy densities of matter ρ_m and radiation ρ_r are related to the cosmic scale factor R(t):

(12.1)   \rho_m \propto R(t)^{-3}, \qquad \rho_r \propto R(t)^{-4}

This means that when R(t) is small (i.e. when the Universe was small and young), ρ_r > ρ_m and
the radiation dominated the Universe.
Exercise 101. At what redshift were the energy densities of matter and radiation equal? We call
this era matter-radiation equality, which happened at a time t_eq. To answer this question, use
equation 11.8. You also need to know

(12.2)   \frac{\rho_r(t = \mathrm{today})}{\rho_m(t = \mathrm{today})} = 0.0002 ,

(12.3)   \frac{\rho_r(t = t_{eq})}{\rho_m(t = t_{eq})} = 1 .

Answer: Matter-radiation equality happened at a redshift of z_eq ∼ 5000, which, for your information,
is roughly 10,000 years after the big bang.
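
The reasoning behind that answer can be sketched in a few lines. From equation 12.1 the ratio ρ_r/ρ_m scales as 1/R(t), which by equation 11.8 is 1 + z, so we only need to ask at what redshift the present-day ratio of 0.0002 has grown to 1. The function name below is our own.

```python
RATIO_TODAY = 0.0002   # rho_r / rho_m today, equation (12.2)


def radiation_to_matter_ratio(z, ratio_today=RATIO_TODAY):
    """rho_r / rho_m at redshift z: the ratio scales as R^-4 / R^-3 = 1/R = 1 + z."""
    return ratio_today * (1.0 + z)


# Equality (equation 12.3) requires ratio_today * (1 + z_eq) = 1.
z_eq = 1.0 / RATIO_TODAY - 1.0
print(f"z_eq ~ {z_eq:.0f}")
```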

12.3. Radiation at the era of Recombination

We’ve shown that there must have been an era in the early hot Universe where radiation dominated.
But what does this mean? Picture a soup of photons and charged particles. The photons scatter
off electrons and so we say this era was ‘opaque’ as none of the light from this era can travel out
into the Universe (it keeps hitting electrons and bouncing back).

Thanks to inflation the particles in the early Universe are not distributed randomly in this soup and
there are fluctuations. These fluctuations exist because of the tiny random quantum fluctuations
in the pea-sized Universe that existed after the big bang and before inflation (the theory proposed
by Guth) rapidly expanded the Universe $10^{-30}$ s after the Big Bang. Gravity will always try to clump matter
particles together. But in the radiation dominated Universe, the photons are so dense that the
photon pressure can push the clumps apart. An ongoing battle between matter trying to clump
and photon pressure pushing these apart can be observed in the first view of the Universe at
Recombination.

Recombination is the era where the energy density of matter now dominates and the charged
particles have combined to form neutral atoms so the photons freely travel out into the expanding
Universe. We detect the first radiation from this era in the form of the cosmic microwave
background.

12.4. Cosmic Microwave Background

If the Universe started in a hot big bang, then the atoms would have interacted so strongly that
all detailed features in their energy distribution are washed out and we would expect to see a
thermal continuous blackbody spectrum from the Big Bang (for black body spectra look back to
the Stars section 1.6).

As the universe has expanded significantly since the Big Bang we expect to see this relic
radiation from the big bang (emitted at Recombination) isotropically across the sky and at much
longer microwave wavelengths compared to when it was emitted. This is why it is called the
cosmic microwave background or CMB. You can see the data from the COBE satellite in Figure 63
proving this theory and providing one of the strongest arguments for the Big Bang model.

Figure 63. This plot shows measurements from the COBE satellite that measured the intensity
(y-axis) of the microwave radiation in the Universe as a function of wavelength (x-axis). You can
see that the data map out a perfect black body spectrum with temperature T = 2.74 K.
Credit: FIRES
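
To see why a black body at this temperature shows up as a microwave background, here is a minimal sketch using Wien's displacement law, λ_peak = b/T. Wien's law and the value of the constant b are not quoted in this chapter, so treat them as assumptions brought in for illustration; the function name is also just illustrative.

```python
# Sketch: peak wavelength of a black body at the CMB temperature (Wien's law).
WIEN_B = 2.898e-3  # Wien displacement constant in metre kelvin (assumed, not from the text)

def peak_wavelength_m(temperature_k: float) -> float:
    """Wavelength (in metres) at which a black body spectrum peaks."""
    return WIEN_B / temperature_k

lam = peak_wavelength_m(2.74)  # temperature quoted in Figure 63
print(f"peak wavelength = {lam * 1e3:.2f} mm")
```

The peak comes out at roughly a millimetre, which sits in the microwave part of the spectrum.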

COBE was the first satellite to measure the spectrum of the CMB in great detail. Many missions
followed with the most ground-breaking follow-up coming from the Wilkinson Microwave Anisotropy
Probe and the Planck satellite. Figure 64 shows the Planck temperature fluctuation map of the
CMB and the hot spots (red) and cold spots (blue) that confirm our theory on the competition
between matter and radiation in the early Universe.

Figure 64. This iconic image shows a map of the temperature fluctuations in the CMB from the
Planck satellite. You can see hot spots (red - gravity wins) and cold spots (blue - photon pressure
wins) from the battle between gravity and photon pressure at recombination. Adopting a model
of the dark Universe, theorists can predict the number and scale of the ‘hot’ and ‘cold’ spots that
are observed. Confronting theoretical predictions with Planck’s observations has provided the most
precise measurements to date of the different components in our Universe finding 26.6% dark
matter, 68.5% dark energy and 4.9% baryons. Image credit: ESA/NASA
13: NUCLEOSYNTHESIS

13.1. Outline of content


• Chemical building blocks
• Deuterium abundance
• Quasar Absorption Lines
• Baryon Density

13.2. Chemical building blocks

In the very early radiation dominated Universe, protons and neutrons move fast, collide often
and fuse to form nuclei. The high density of high energy gamma ray photons, however, manages to
blast any nuclei that form apart. As the Universe expands and cools there should be a very short
period (between ∼ 1 and 120 seconds after the Big Bang) where the Universe is dense and hot
enough for particle collisions and nuclear fusion to occur, and the expansion has reduced the
photon energy to such an extent that they are unable to break the more stable nuclei up.

In this short period of nucleosynthesis the following nuclei, once formed, are stable.

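(13.1) ${}^{2}_{1}\mathrm{H} \quad {}^{3}_{1}\mathrm{H} \quad {}^{3}_{2}\mathrm{He} \quad {}^{4}_{2}\mathrm{He} \quad {}^{7}_{3}\mathrm{Li} \quad {}^{7}_{4}\mathrm{Be}$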

The symbol letters represent the element (H=Hydrogen, He=Helium, Li=Lithium, Be=Beryllium)
and the upper and lower numbers are the mass number A and the atomic number Z:

• Notation: ${}^{A}_{Z}\mathrm{Element}$
• Atomic Number Z: number of protons
• Mass Number A: number of protons plus neutrons
Exercise 102. Starting with protons ${}^{1}_{1}p$ and neutrons ${}^{1}_{0}n$, use the stable nuclei as building blocks
to create the two most massive nucleosynthesis nuclei: Lithium ${}^{7}_{3}\mathrm{Li}$ and Beryllium ${}^{7}_{4}\mathrm{Be}$. Start the
chain as follows:
(13.2) ${}^{1}_{1}p + {}^{1}_{0}n \rightarrow {}^{2}_{1}\mathrm{H}$
(13.3) ${}^{1}_{0}n + {}^{2}_{1}\mathrm{H} \rightarrow {}^{3}_{1}\mathrm{H}$
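
When building your own chain, each step must conserve both the mass number A and the atomic number Z. The sketch below is one way to check this bookkeeping; the dictionary keys and the function name are made up for this example, and the check is a necessary condition only (it says nothing about whether a reaction is energetically favoured).

```python
# Sketch: check that a nuclear reaction conserves mass number A and atomic number Z.
# Each particle is stored as an (A, Z) pair, matching the notation defined above.
NUCLEI = {
    "p":   (1, 1),  # proton
    "n":   (1, 0),  # neutron
    "H2":  (2, 1),  # Deuterium
    "H3":  (3, 1),  # Hydrogen with A = 3
    "He3": (3, 2),
    "He4": (4, 2),
    "Li7": (7, 3),
    "Be7": (7, 4),
}

def is_balanced(reactants, products):
    """True if the total A and the total Z are the same on both sides."""
    def totals(names):
        return (sum(NUCLEI[n][0] for n in names),
                sum(NUCLEI[n][1] for n in names))
    return totals(reactants) == totals(products)

print(is_balanced(["p", "n"], ["H2"]))    # equation (13.2)
print(is_balanced(["n", "H2"], ["H3"]))   # equation (13.3)
print(is_balanced(["p", "H2"], ["He4"]))  # unbalanced: A is 3 on the left, 4 on the right
```

Any photons released in a reaction carry no A or Z, so they do not need to appear in the check.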

At the end of the nucleosynthesis period the Universe has cooled to such an extent that no
heavier elements can be made: as the density decreases (with the expansion), collisions
become rare and unimportant. The rest of the chemical elements in our periodic table are created much
later on in the history of the Universe in the cores of stars (see section 3.3.1).

Given this theory, a theoretical physicist can calculate with very high precision what fraction of
each of the primordial elements should form during nucleosynthesis. An observational astronomer
can then go out and put this theory to the test.

13.3. Deuterium

We’re going to focus our attention on Deuterium, also known as heavy Hydrogen ${}^{2}_{1}\mathrm{H}$. This is a
particularly interesting chemical as there is no way to produce any quantity of this element at
any other point in the history of the Universe except during nucleosynthesis. The cores of stars
do get to the required temperatures for Deuterium to be created (see reaction equation 13.2),
but they are so dense, compared to the early Universe, that the chain of reactions doesn’t stop and
they use up all of their Deuterium in other reactions (see for example reaction equation 13.3). So
if an observer can measure the abundance of Deuterium in the Universe they can directly test
the Big Bang nucleosynthesis theory.

Figure 65. Schematic of a Quasar Absorption Line experiment where light from a distant quasar is
absorbed by Hydrogen gas in the foreground as it travels to Earth. A spectrum of the quasar (lower
panel) reveals the quasar spectrum (right) and a series of absorption lines (left) from the foreground
gas. Credit: ESO

Figure 65 shows a schematic of an observation that can measure the abundance of Deuterium.
As we expect the primordial gas to clump with other matter in the Universe, observers look for
it at the edges of galaxies. They use distant Quasars to shine light through the outer regions
of galaxies and look to see if the quasar’s light has been absorbed by any cool gas around the
galaxy. In the lower panel of Figure 65 you can see the quasar emission spectrum on the right;
the peak is the light emitted by the hot Hydrogen gas in the Quasar. To the left of the spectrum
we see lots of deep absorption features, caused by Hydrogen that is diffuse in the Universe, but
at different distances (the redshift of the Hydrogen line for the targeted foreground galaxy is
lower than the redshift of the Hydrogen line for the quasar).

[Figure 66: plot of Flux (y-axis) against Wavelength in nm (x-axis), with the Deuterium absorption line and the Hydrogen absorption line labelled.]

Figure 66. Zoom-in on a Hydrogen absorption line showing two components, both Hydrogen and
Deuterium. The difference between how deep each model goes tells us the abundance of
Deuterium relative to the abundance of Hydrogen. Data from Ferlet et al 1996.

Figure 66 shows a zoom-in of a Hydrogen absorption line from real data. The observers have
modelled this data with two features: the absorption you would expect to see from a cloud of
Hydrogen gas, and the absorption you would expect to see from a cloud of Deuterium gas. They
change the relative amount of Deuterium and Hydrogen in the cloud until the sum of the two
models reproduces the data really well. They do this for many many quasars and average the
results to produce a final measure of the abundance of the Deuterium in the Universe that was
produced minutes after the Big Bang.

13.4. Nucleosynthesis and the baryon density

Baryons is the term used to describe all of the stuff that a particle physicist understands from
the ‘standard model of particle physics’. Protons, neutrons, everything in the periodic table is
baryons. Anything that is not a baryon is deemed mysterious and we’ll come to that when we
discuss the ‘dark side’!

Our theoretical physicist’s calculations of the abundance of the primordial Big Bang elements
will naturally depend on the total amount of ordinary baryonic particles in the Universe, i.e. the
total number of protons and neutrons that were created in the Big Bang. So not only does
nucleosynthesis provide an important test of the Big Bang theory, it also provides us with a way of
measuring the total amount of baryons in the Universe.

Figure 67 is an important plot that compares theory and observations, but it is rather hard to
read. There are a number of components:

Figure 67. Comparison of Big Bang primordial element abundance from nucleosynthesis theory
(coloured) with observations (black boxes).

• The four coloured thick lines show the theoretical models for the abundance of Helium
${}^{4}_{2}\mathrm{He}$ (green), Deuterium ${}^{2}_{1}\mathrm{H}$ (red), Helium ${}^{3}_{2}\mathrm{He}$ (blue) and Lithium ${}^{7}_{3}\mathrm{Li}$ (purple) as created
during nucleosynthesis. These coloured bands show the value of the abundance of each
of these chemicals (y-axis) as a function of the density of baryons in the Universe today
(x-axis).
• The black boxes and arrows show the observed abundance of each element. These
are boxes rather than points because the box or arrows show you the errors on the
measurement. They are positioned along the x-axis so as to overlap with the theory.
• The light blue vertical line shows the baryon density where all the models and observations
agree.

This important result is driven by the Deuterium measure (shown red), but the fact that all the
models and measurements for all the primordial elements agree is very strong evidence for the
Big Bang model, and provides a precise measure for the density of baryons in the Universe.

Let’s define the baryon density parameter as the ratio of the baryon density to the critical
density (see equation 11.4)
(13.4) $\Omega_b = \frac{\rho_b}{\rho_{crit}}$
From this nucleosynthesis observation we measure Ωb = 0.045.
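
To get a feel for what Ωb = 0.045 means as a physical density, here is a rough sketch. It assumes the critical density takes the standard form ρcrit = 3H0²/(8πG) and uses H0 = 70 km/s/Mpc; both the formula and the value of H0 are assumptions made for this illustration rather than numbers taken from this chapter.

```python
import math

G = 6.674e-11          # gravitational constant, m^3 kg^-1 s^-2
MPC_IN_M = 3.0857e22   # one megaparsec in metres
M_PROTON = 1.6726e-27  # proton mass, kg

def critical_density(h0_km_s_mpc: float) -> float:
    """Critical density in kg/m^3, assuming rho_crit = 3 H0^2 / (8 pi G)."""
    h0_si = h0_km_s_mpc * 1e3 / MPC_IN_M  # convert H0 to s^-1
    return 3.0 * h0_si**2 / (8.0 * math.pi * G)

H0 = 70.0        # assumed Hubble constant in km/s/Mpc
OMEGA_B = 0.045  # baryon density parameter from nucleosynthesis

rho_crit = critical_density(H0)
rho_b = OMEGA_B * rho_crit  # rearranging equation (13.4)
print(f"rho_crit = {rho_crit:.2e} kg/m^3")
print(f"rho_b    = {rho_b:.2e} kg/m^3")
print(f"equivalent to {rho_b / M_PROTON:.2f} proton masses per cubic metre")
```

With these assumed inputs the baryon density works out to roughly a quarter of a proton mass per cubic metre.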

We discussed before that astronomers think we live in a flat Universe with Ω = 1. This nucleosynthesis
result then means that baryons, the stuff we’re made up of, the stuff we understand, makes up
only the tiniest fraction of our Universe. The rest is dark and mysterious.......
14: THE DARK SIDE OF THE UNIVERSE

14.1. Outline of content


• Rotation Curves
• Non-baryonic Dark Matter
• The Distance-Redshift Relation
• Deceleration parameter
• Dark Energy and the accelerated expansion of the Universe
• Concordant Cosmology

If our Universe is flat with Ω = 1 and the baryonic component is only Ωb = 0.045, what else is
filling our Universe? There are two components called Dark Matter and Dark Energy, and we’ll
look at the evidence for both of them in turn.

14.2. Dark Matter

14.2.1. Rotation Curves

You’ve already met the concept of a halo of Dark Matter surrounding every galaxy in Galaxies
section 7.4.3. The argument for the need for Dark Matter around galaxies comes from observations
of the rotation of stars around the centre of galaxies.
Exercise 103. Show that the rotation velocity of stars a distance R from the centre of a galaxy
is given by
(14.1) $v = \sqrt{\frac{G M(<R)}{R}}$
where M (< R) is the mass of the galaxy enclosed within a radius R. Assume that the galaxy is
a thin circular disc and use Birkhoff’s theorem (that we also used in section 11.2), which in this
context states that the mass within the star’s orbit acts as a point source at the centre and that
the gravitational force due to all the mass outside the star’s orbit is zero.

Hint: to do this you’ll need to equate gravity with centripetal force and re-arrange.

Exercise 104. Using equation 14.1, sketch a plot of the rotation velocity of the stars as a function
of their distance from the centre of the galaxy, for a galaxy where M (< R) is constant for all R (i.e.
a centrally concentrated clump of stars) and a galaxy where M (< R) ∝ R. Compare your sketch
of these two rotation curves with the ‘observed vs predicted’ rotation curve in Figure 46. Which
model for the mass of the galaxy matches the observations?

Observations of the rotation of stars around the centre of galaxies show that the rotation velocity
is constant out to very large scales, much larger than the extent of the stars that we can see.
This means that the mass M (< R) ∝ R and that there is missing matter that we can’t see. This
missing matter has been called Dark Matter.
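
The two mass models can be compared numerically with equation 14.1. The sketch below uses a hypothetical galaxy with 10^11 solar masses enclosed within 10 kpc; those numbers, and the function name, are illustrative assumptions rather than measurements.

```python
import math

G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30     # solar mass, kg
KPC_IN_M = 3.0857e19 # one kiloparsec in metres

def rotation_velocity(m_enclosed_kg: float, r_m: float) -> float:
    """Equation (14.1): v = sqrt(G M(<R) / R), returned in m/s."""
    return math.sqrt(G * m_enclosed_kg / r_m)

# Hypothetical galaxy: 1e11 solar masses enclosed within 10 kpc.
M0 = 1e11 * M_SUN
R0 = 10.0 * KPC_IN_M

for r_kpc in (10, 20, 40, 80):
    r = r_kpc * KPC_IN_M
    v_const_mass = rotation_velocity(M0, r)            # M(<R) constant
    v_linear_mass = rotation_velocity(M0 * r / R0, r)  # M(<R) proportional to R
    print(f"R = {r_kpc:3d} kpc: constant M gives {v_const_mass / 1e3:6.1f} km/s, "
          f"M proportional to R gives {v_linear_mass / 1e3:6.1f} km/s")
```

For constant enclosed mass the velocity falls as 1/√R, halving each time R increases fourfold, while for M (< R) ∝ R the velocity stays the same at every radius, which is the flat behaviour described above.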

14.2.2. Gravitational Lensing

Einstein’s theory of General Relativity tells us that mass bends space and time. This means that
the path of light, as it travels through the Universe, can be bent or lensed by foreground matter.
Figure 68 shows a lensing diagram that should be familiar to anyone who has studied optics.
The physics shown in this diagram is exactly the same as the physics of optical lenses, except
the lens here is a massive galaxy surrounded by dark matter. Light from the more distant galaxy
behind it is lensed such that, for the observer on the left of the diagram, it appears that there
are two galaxies. Indeed if the background galaxy is directly behind the foreground galaxy the
observer will see a perfect Einstein Ring, which you can see an example of in the top left corner
of Figure 68.

Figure 68. Gravitational Lens Diagram: Light from the distant galaxy on the right is lensed by the
foreground galaxy (in the middle) such that the observer on the left of the diagram sees two
galaxies. If the background galaxy is directly behind the foreground galaxy the observer will see a
perfect Einstein Ring, which you can see an example of, imaged by the Hubble Space Telescope,
in the top left corner.

Exercise 105. Use Figure 68 to show that the observed position angle of the lensed galaxy θ is
related to the deflection angle α as
(14.2) $\theta = \alpha \frac{D_{LS}}{D_S}$
where DLS is the distance between the lens (foreground galaxy) and the source (background
galaxy), and DS is the distance to the source. You can safely assume that θ and α are small,
such that small angle approximations apply.

From Einstein’s theory of General Relativity you can show that the deflection angle α is related
to the total mass of the lens M as
(14.3) $\alpha = \frac{4GM}{c^2 r}$
Exercise 106. Use Figure 68 and equation 14.3 to show that the observed position angle of the
lensed galaxy θ is related to the mass of the lens M as
(14.4) $\theta^2 = \frac{4GM}{c^2} \frac{D_{LS}}{D_S D_L}$

What is fantastically neat about this, is that by finding lensed galaxies, such as the ones shown
in Figure 69 we can directly measure the total mass of the lens galaxy, independent of the type
of matter that the galaxy is made up of; M is both the mass of the galaxy that we can see in
stars, plus the dark matter that we can’t see.
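
As a rough illustration of how equation 14.4 turns an observed angle into a mass, the sketch below rearranges it for M. The input values (an angle of 1 arcsecond, and lens and source distances of 1000 and 2000 Mpc) are hypothetical, chosen only to give round numbers, and the function names are made up for this example.

```python
import math

G = 6.674e-11         # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8           # speed of light, m/s
M_SUN = 1.989e30      # solar mass, kg
MPC_IN_M = 3.0857e22  # one megaparsec in metres
ARCSEC_IN_RAD = math.pi / (180.0 * 3600.0)

def lens_mass_kg(theta_rad, d_l, d_s, d_ls):
    """Equation (14.4) rearranged: M = theta^2 c^2 D_S D_L / (4 G D_LS)."""
    return theta_rad**2 * C**2 * d_s * d_l / (4.0 * G * d_ls)

def lensed_angle_rad(mass_kg, d_l, d_s, d_ls):
    """Equation (14.4) solved for theta, used here as a consistency check."""
    return math.sqrt(4.0 * G * mass_kg * d_ls / (C**2 * d_s * d_l))

# Hypothetical inputs, for illustration only.
theta = 1.0 * ARCSEC_IN_RAD
d_l = 1000.0 * MPC_IN_M   # distance to the lens
d_s = 2000.0 * MPC_IN_M   # distance to the source
d_ls = 1000.0 * MPC_IN_M  # distance between lens and source

mass = lens_mass_kg(theta, d_l, d_s, d_ls)
print(f"lens mass = {mass / M_SUN:.2e} solar masses")
```

With these made-up inputs the lens mass comes out at a few times 10^11 solar masses, a galaxy-scale mass, and it counts everything inside the ring whether it shines or not.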

14.3. What is Dark Matter?

An early hypothesis was that this dark matter was cold baryonic matter bound up in dark objects
such as failed faint Brown Dwarf stars (that never reach the core temperatures to start nuclear
burning). This type of object has been extensively searched for using alternative techniques
though and never been seen in the quantities required. In addition we know from nucleosynthesis
that only ∼ 5% of the Universe is made up of baryons, so whatever this extra mass is it must be
non-baryonic.

Have we got our gravity sums right? There is a theory called Modified Newtonian Dynamics
(MOND) that states that, at very low accelerations, Newton’s second law should be modified to the
form $F \propto ma^2$, which
predicts a flat rotation curve and hence no need for dark matter. The majority of astronomers rule
out MOND as an alternative to Dark Matter though, as gravity is well tested on solar system scales
and so to get MOND to work you need a tuning parameter which changes for different galaxies.
This tuning parameter goes against the Cosmological Principle that everything (including the laws
of physics) is homogeneous and isotropic.

Our current best guess is that Dark Matter is made up of cold non-baryonic massive particles
that only very weakly interact with the baryons that we’re made up of. It is the dominant source of
gravity in the Universe. Figure 70 shows the Bullet Cluster which provides direct evidence for this
type of dark matter particle. In the image you can see hot baryonic gas detected in X-rays (pink)
and matter (blue) as measured by looking for gravitational lensing of galaxies behind the cluster.
Most of the matter in the cluster (blue) is clearly separated from the normal matter (pink), giving
direct evidence that nearly all of the matter in the cluster is non-baryonic. If the hot gas was the
most massive component in the clusters, as proposed by alternative theories of gravity such as
MOND, such an effect would not be seen.

Figure 69. Data from the SLACS survey showing a single galaxy in each panel, comparing the
observed data (left) and the model predicted by GR (right) of hundreds of Einstein Rings spotted
with the Hubble Space Telescope. With this type of data, SLACS and other teams were able to
determine that there is about 40 times as much dark matter in a typical galaxy, compared to the
mass in stars. Credit: SLACS

Our final piece of evidence for the existence of dark matter is the distribution of galaxies in the
Universe. If there was no dark matter then the galaxies that we can see in the Universe wouldn’t
be distributed in the clumps and filaments that we observe today. It’s the underlying web of Dark
Matter in the Universe that is dictating when and where galaxies form in the Universe. (If you’re
interested in this check out www.cfhtlens.org).

14.4. Supernova and the accelerating expansion of the Universe

The Nobel Prize for physics in 2011 went to three distinguished cosmologists who in the 1990s
pioneered a method to find and use distant supernovae as standard candles to probe the Universe.
What they found was so startling it resulted in them winning the Nobel Prize.

Figure 70. This composite image shows the bullet cluster which was formed after the collision of
two large clusters of galaxies. The pink shows the regions of normal hot X-ray emitting baryons and
the blue shows where most of the matter is. As most of the matter in the clusters (blue) is clearly
separated from the normal matter (pink), the bullet cluster gives direct evidence that nearly all of
the matter in the cluster is dark. Caption and picture credit: X-ray: NASA/CXC/CfA/M.Markevitch
et al.; Optical: NASA/STScI; Magellan/U.Arizona/D.Clowe et al.; Lensing Map: NASA/STScI; ESO
WFI; Magellan/U.Arizona/D.Clowe et al.

A Type Ia supernova (SN1a) is the explosion of a white dwarf that has accreted just enough mass
from its binary companion star to exceed the Chandrasekhar mass (see Stars section 3.5). Because
the explosion is always triggered at the same mass, a SN1a always emits the same Luminosity L
(i.e. energy per unit time). If you can find them, you can measure their flux f and, knowing the SN1a
Luminosity L, you can immediately calculate a distance DL to the supernova:
(14.5) $f = \frac{L}{4\pi D_L^2}$
The ‘L’ just indicates that this is a distance determined from the Luminosity. You can also take a
spectrum of the supernova to determine its redshift z. This allows us to directly test Hubble’s Law
and the distance-redshift relation;

(14.6) $D_L \simeq \frac{c(1+z)}{H_0}\left[z - \frac{1+q}{2}z^2\right]$

Before you panic, you’ve already seen this where you learnt D = cz/H0 when the redshift z << 1.
Equation 14.6 is just an extension of this formula for more distant galaxies (and in fact the relation
gets even more complicated for z > 1 but this extension is sufficient for our purposes). This
relation introduces q, a parameter which, in light of the Nobel Prize winners’ findings, is quite
ironically called the deceleration parameter;
(14.7) $q = -\left(1 + \frac{\dot{H}}{H^2}\right)$

In section 11.3 we said that Hubble’s constant wasn’t actually a constant at all times; H in the
above equation is the changing Hubble parameter, which has a value today of H0 . Ḣ is the rate
of change of the Hubble parameter, or how fast Hubble’s constant is changing over time. Before
the supernova results, it was thought that the expansion of the Universe was decelerating, i.e.
that Ḣ < 0 and q > 0.
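
To see how the deceleration parameter enters the distance-redshift relation, here is a minimal sketch of equation 14.6 next to the low-redshift form D = cz/H0. The value H0 = 70 km/s/Mpc and the trial values of q are assumptions chosen for illustration, and the function names are made up for this example.

```python
C_KM_S = 299792.458  # speed of light in km/s
H0 = 70.0            # assumed Hubble constant in km/s/Mpc

def luminosity_distance_mpc(z: float, q: float, h0: float = H0) -> float:
    """Equation (14.6): D_L = c (1 + z) / H0 * [z - (1 + q) / 2 * z^2], in Mpc."""
    return C_KM_S * (1.0 + z) / h0 * (z - 0.5 * (1.0 + q) * z**2)

def hubble_distance_mpc(z: float, h0: float = H0) -> float:
    """Low-redshift form D = c z / H0, in Mpc."""
    return C_KM_S * z / h0

# At small redshift the two expressions nearly coincide, whatever q is.
print(luminosity_distance_mpc(0.01, q=-0.5), hubble_distance_mpc(0.01))

# At higher redshift the sign of q matters: q < 0 gives the larger distance.
for q in (-0.5, 0.0, 0.5):
    print(f"q = {q:+.1f}: D_L(z = 0.5) = {luminosity_distance_mpc(0.5, q):.0f} Mpc")
```

A supernova at a given redshift is further away (and so fainter) in a Universe with q < 0 than in one with q > 0, which is the signature the supernova surveys looked for.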
Exercise 107. This is a really challenging one, and I don’t expect you all to be able to answer
this exercise. For those of you who like a challenge, can you use Hubble’s Law (equation 10.2)
to show why, in a decelerating Universe, the rate of change of the Hubble parameter Ḣ < 0? You
don’t need to use calculus.

Figure 71. The distance-redshift relation: A compilation of 740 supernova measurements of
distance DL (y-axis) and redshift z (x-axis) from four different supernova surveys (shown in blue,
magenta, cyan and red). See text for more details. Data Source: Marc Betoule and collaborators.

The upper panel of Figure 71 shows a compilation of 740 supernova measurements of distance
DL (y-axis) and redshift z (x-axis) from four different supernova surveys. Astronomers have long
measured the brightness of objects in units of magnitudes and so we’ve kept to this convention
for this Figure, where the length of each line indicates the error on each measurement of the
‘distance modulus’ of the supernova µ, which is related to the distance DL (in parsecs) through µ =
5(log10 DL − 1). The black solid, dashed and dotted lines show the distance-redshift relation
(equation 14.6) for three different cosmological models, one accelerating with q < 0 (solid),
one decelerating with q > 0 (dotted) and finally an empty Universe that expands at constant
speed with q = 0 (dashed). With so many supernova measurements, the data points overlap.
In the lower panel we therefore show the data binned by redshift, where we have subtracted a
distance-redshift relation that assumes an empty model of the Universe, with no dark energy and
no dark matter.
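
The relation between µ and DL used for Figure 71 can be tried out directly. The sketch below assumes that DL is measured in parsecs and that the logarithm is base 10 (the usual astronomical convention); the function names are illustrative.

```python
import math

def mu_from_distance_pc(d_pc: float) -> float:
    """mu = 5 (log10 D_L - 1), with D_L in parsecs."""
    return 5.0 * (math.log10(d_pc) - 1.0)

def distance_pc_from_mu(mu: float) -> float:
    """Inverse relation: D_L = 10^(mu / 5 + 1) parsecs."""
    return 10.0 ** (mu / 5.0 + 1.0)

for d_pc in (10.0, 1e6, 1e9):  # 10 pc, 1 Mpc, 1 Gpc
    print(f"D_L = {d_pc:.0e} pc gives mu = {mu_from_distance_pc(d_pc):.1f}")
```

Because µ is logarithmic, every factor of 10 in distance adds 5 to µ, which is why the y-axis of Figure 71 spans only a small range of numbers even though the supernovae cover a huge range of distances.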

Exercise 108. Verify the values of the deceleration parameter for the three models shown in
Figure 71 using the numbers in the inset of the Figure and the equation
(14.8) $q \simeq \frac{\Omega_m}{2} - \Omega_\Lambda$.
Which model is the best fit to the data and what can you conclude about the expansion of the
Universe from these results?

The supernova data showed that the high redshift supernovae are significantly further away than
they are expected to be in a universe without dark matter and dark energy. The conclusion
is that the expansion of our Universe cannot be decelerating, or expanding at constant speed.
The expansion is in fact accelerating, fuelled by a mysterious source of dark energy. We quantify
the amount of Dark Energy through the density parameter ΩΛ .
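
Equation 14.8 makes it easy to see how the two density parameters compete. The sketch below evaluates it for three illustrative models; the labels and the function name are made up for this example.

```python
def deceleration_parameter(omega_m: float, omega_lambda: float) -> float:
    """Equation (14.8): q is approximately Omega_m / 2 - Omega_Lambda."""
    return omega_m / 2.0 - omega_lambda

# Illustrative models: (Omega_m, Omega_Lambda)
models = {
    "matter only":          (0.3, 0.0),
    "matter + dark energy": (0.3, 0.7),
    "empty":                (0.0, 0.0),
}

for name, (omega_m, omega_lambda) in models.items():
    q = deceleration_parameter(omega_m, omega_lambda)
    if q > 0:
        behaviour = "decelerating"
    elif q < 0:
        behaviour = "accelerating"
    else:
        behaviour = "constant speed"
    print(f"{name:22s} q = {q:+.2f} ({behaviour})")
```

Matter on its own always gives q > 0; it takes a sufficiently large ΩΛ (more than Ωm/2) to make q negative.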
Exercise 109. Use the distance-redshift relation in equation 14.6 to calculate the distance to a
supernova at z = 1 for two cases; (i) a Universe with a decelerating expansion Ωm = 0.3, ΩΛ = 0.0
and (ii) a Universe with an accelerated expansion Ωm = 0.3, ΩΛ = 0.7. In which Universe is the
distant supernova furthest away?

14.5. What is Dark Energy?


• The simplest explanation is simply that the vacuum in the Universe has an energy
associated with it that can cause this accelerated expansion. But if you ask a particle
physicist how much energy a vacuum has, their answer is a factor of $10^{60}$
different from the dark energy we measure.
• It could be that the Universe is experiencing a new period of Inflation (as it did in the early
Universe). Maybe a new field has recently kicked into action to cause this accelerated
expansion.
• Could it be that our Universe is just one Universe in a Multiverse of Universes and the
Universe next to ours is sucking ours into it???
• Does gravity work on cosmological scales in the same way it works in our solar system?
Maybe we need to go beyond Einstein with our theory of gravity?

The jury is out on what Dark Energy is, but it’s widely believed that our final understanding of the
Dark Universe will have to involve new physics that will forever change our view of the Universe.

14.6. A concordant cosmology

The supernova results and the rotation curve measurements are not the only observations to
indicate the need for Dark Matter and Dark Energy in our Universe. Analysis of the CMB (see
section 11.7) in addition to the analysis of how galaxies clump together all point to the same type
of Universe:

• The baryonic content is very small Ωb ∼ 0.05
• The total dark and baryonic matter content Ωm ∼ 0.3
• The dark energy content ΩΛ ∼ 0.7
• The Universe is flat and has critical density Ω = Ωm + ΩΛ = 1 (Ωb is already included in Ωm)

This is referred to as the ‘concordant cosmology’ as many different experiments all find the same
results as shown in Figure 72.

[Figure 72: legend labels Supernova 1a, CMB, CMB + BAO.]

Figure 72. The concordant cosmology as measured from supernova results (SNe - blue), the
cosmic microwave background (CMB - green) and looking at how galaxies clump (BAO - red).
The different coloured regions show which models of the Universe agree with the data from each
experiment. The three coloured regions all intersect (shown red) which pinpoints our best measure
of what makes up the Universe.

The introduction of Dark Energy into our model of the Universe changes our analysis of the
evolution of the cosmic scale factor R(t) in section 11.4 where we considered only matter in
the Universe. Figure 73 shows the evolution of the cosmic scale factor in a Universe with Dark
Energy, showing the kick-up acceleration that Dark Energy introduces.
Exercise 110. What are the similarities and differences between the evolution of the cosmic
scale factor shown in Figure 60 (a Universe without Dark Energy) and the evolution of the cosmic
scale factor shown in Figure 73 (a Universe with Dark Energy)?

14.7. The Future of the Universe

We could end this course with a rather sad thought. With our current understanding of dark
energy (which is illustrated as the constant dark energy model in Figure 73) the ultimate fate of
the Universe is cold and dark. The expansion of the Universe will continue accelerating such
that the Universe gets bigger, colder and darker as the stars burn out their last fuel. It will then
be a very very cold and empty place......

Figure 73. Plot of the cosmic scale factor R (y-axis) as a function of time for a Universe containing
Dark Energy. Picture and caption from NASA, Chandra: This illustration shows three possible
futures for the Universe, depending on the behaviour of dark energy, by showing how the scale
of the Universe may change with time. If dark energy is constant the expansion should continue
accelerating forever. If dark energy increases, the acceleration may happen so quickly that galaxies,
stars, and eventually atoms will be torn apart, in the so-called Big Rip. Dark energy may also lead to
a re-collapse of the Universe, in the Big Crunch. The illustration also shows the early decelerating
expansion of the Universe, followed by the dark energy induced accelerating phase that started
about 6 billion years ago.

How I’d rather end this book however, is by telling you that my bet is that our current understanding
of the Universe is incomplete. Over the next decade with new telescopes on both the ground
and in space, my hope is that we find something truly ground-breaking and unexpected about
the Universe. Perhaps our model of gravity is incomplete and when we finish this puzzle we
will have turned our understanding of the dark universe on its head. My fear, however, is that
we’ll find no deviations from the concordance model that Planck has already clearly shown to
fully explain the Universe right after the big bang. I fear this as, in my opinion, the most solid
theoretical reasoning for why the dark energy should be as small as it is, is grounded in a theory
that predicts an almost infinite number of multiple universes. Each one of these bubble universes
features a different realisation of the constants that determine the amplitudes of the fundamental
forces. We imagine that our Universe is the only reality, but perhaps the reason why we exist
at all is because in our realisation the fundamental constants, including ΩΛ , are well tuned for
life. As an observer this is a hard concept to swallow as it cannot be directly tested. However
there is some glimmer of hope, as it is a branch of the different inflation theories that implies this
multiverse conclusion, and with the next generation of CMB experiments planned for the coming
decades, these theories can and will be rigorously confronted.
15: SUMMARY OF IMPORTANT THINGS TO REMEMBER

Cosmology is built on the following two principles:

• Copernican Principle - we do not live at the centre of the Universe
• Cosmological Principle - the Universe is homogeneous and isotropic on large scales
(at any instant the Universe has the same properties everywhere).

Hubble’s law: Each galaxy is traveling away from every other galaxy at speed V which is directly
proportional to the distance between the galaxies D.

(15.16) V = H0 D

The Universe could be infinite, but the part of the Universe that we can observe is certainly finite.
We call the extent of the observable Universe the Horizon.

The cosmic scale factor R(t) is the separation between any two galaxies at time t, compared to
their separation today.
(15.17) $R(t) = \frac{r(t)}{r(t = \mathrm{today})} = \frac{1}{1+z}$
You should know how to derive this relationship between the cosmic scale factor and redshift.

The density parameter is related to the geometry of the Universe and its fate.
(15.18) $\Omega_0 = \frac{\rho}{\rho_{crit}}$

• Closed: Ω0 > 1, E < 0, the big crunch
• Open: Ω0 < 1, E > 0, expands for ever
• Flat: Ω0 = 1, E = 0, the critical case

You should know how to derive a formula for ρcrit using the conservation of energy and mass
within a sphere in section 11.3.


The energy of matter and radiation:

• Matter Energy: $E = mc^2$
• Photon Energy: E = hc/λ ∝ 1/R(t)

Big Bang nucleosynthesis forms the following stable nuclei ∼ 3 seconds after the Big Bang.
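(15.19) ${}^{2}_{1}\mathrm{H} \quad {}^{3}_{1}\mathrm{H} \quad {}^{3}_{2}\mathrm{He} \quad {}^{4}_{2}\mathrm{He} \quad {}^{7}_{3}\mathrm{Li} \quad {}^{7}_{4}\mathrm{Be}$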

Deuterium (${}^{2}_{1}\mathrm{H}$) cannot be formed in any quantity at any other time in the history of the Universe.

The baryon density parameter
(15.20) $\Omega_b = \frac{\rho_b}{\rho_{crit}}$
From nucleosynthesis we measure Ωb = 0.045 implying our Flat Universe with Ω = 1 is made up
of something much more exotic than the baryonic particles that we’re familiar with on Earth.

The rotation velocity of stars a distance R from the centre of a galaxy is given by
(15.21) $v = \sqrt{\frac{G M(<R)}{R}}$
where M (< R) is the mass of the galaxy enclosed within a radius R. You should know how to
derive this by equating the gravitational force with the centripetal force.

The concordant cosmology model of our Universe tells us that it is flat and it is made up of

• Baryons Ωb ∼ 0.05
• Dark Matter Ωdm ∼ 0.25 (so the total matter content is Ωm = Ωb + Ωdm ∼ 0.3)
• Dark Energy ΩΛ ∼ 0.7

With this model the ultimate fate is for the expansion of the Universe to continue accelerating
such that the Universe gets colder and darker as the stars burn out their last fuel.
