You are on page 1of 6

Y A I K H O M NOTES • WORK • ABOUT

The sum of squares


September 09, 2012

I have wondered how the closed form for the sum of squares for the first n
natural numbers was derived. Given the formula for the sum,

12 + 22 + 32 + ⋯ + n2 = n(n + 1)(2n + 1)/6

I learned to prove its correctness using mathematical induction. However, I


never understood how the formula was derived in the first place! Knuth writes
in The Art of Computer Programming, Volume 1, Third Edition, pp. 32 [Addison-
Wesley, June 2009]:

Most textbooks would simply state those formulas, and prove them by
induction. Induction is, of course, a perfectly valid procedure; but it does
not give any insight into how on earth a person would ever have
dreamed the formula up in the first place, except by some lucky guess. …

Carl Friedrich Gauss used the following approach, apparently while still in
primary school, to derive a closed form for the sum of the first n positive
natural numbers, S = 1 + 2 + 3 + ⋯ + (n − 2) + (n − 1) + n. He first
reordered the sum as S ′ = n + (n − 1) + (n − 2) + ⋯ + 3 + 2 + 1, and
added S and S ′ , giving
S + S ′ = (1 + n) + (1 + n) + ⋯ + (1 + n) = n(1 + n). Since S = S ′ , we
get the sum as S = n(n + 1)/2. So, are there similarly simple derivations for
the sum of squares? As it turns out, there are several.

A simple derivation

While randomly browsing the web, I fortunately stumbled upon a bulletin


message1 that gave a simple derivation of the closed form, as follows:

Let S be the sum of squares of the first n natural numbers, such that

S = 12 + 22 + 32 + ⋯ + n2 .
Our aim is to derive a closed form formula for S in terms of n. We have

(x + 1)3 = (x + 1)(x2 + 2x + 1)
= x3 + 2x2 + x + x2 + 2x + 1
= x3 + 3x2 + 3x + 1.

Using this, we write the following sequence of equations where 1 ≤ x ≤ n:

(1 + 1)3 = 13 + 3.12 + 3.1 + 1


(2 + 1)3 = 23 + 3.22 + 3.2 + 1
(3 + 1)3 = 33 + 3.32 + 3.3 + 1

(n + 1) = n3 + 3.n2 + 3.n + 1
3

If we add up each of the five columns separately, we get

(1 + 1)3 + (2 + 1)3 + ⋯ + (n + 1)3 = 13 + 23 + ⋯ + n3 +


3.(12 + 22 + ⋯ + n2 )+
3.(1 + 2 + 3 + ⋯ + n)+
1 + 1 + 1 + ⋯ + 1.

After simplification and substitution, we get

23 + 33 + ⋯ + (n + 1)3 = 13 + 23 + ⋯ + n3 +
3S + 3n(n + 1)/2 + n

which, after subtracting the sum of cubes, gives us

(n + 1)3 = 1 + 3S + 3n(n + 1)/2 + n, or


3S = (n + 1)3 − 3n(n + 1)/2 − n − 1, or
6S = 2n3 + 3n2 + n, finally giving us
S = n(n + 1)(2n + 1)/6.

Some history

After studying further, I found out that there are several methods for deriving
this and similar results. In Concrete Mathematics by Graham, Knuth, and
Patashnik, Second Edition, pp. 41–46 [Addison-Wesley, December 2004],
several different approaches are discussed. Some involve simple
manipulations of sums, while others require integral calculus.

As the history goes, Archimedes derived the formula for the sum of squares a
long time ago. I have investigated the derivations and proofs in two principal
translations of Archimedes’ work: The Works of Archimedes by T. L. Heath
[Cambridge University Press, 1897], and Archimedes by E. J. Dijksterhuis
[Copenhagen: Ejnar Munksgaard, 1956].

Dijksterhuis provides a more faithful translation of the original writing in Greek;


however, the work due to Heath presents a fuller study using modern
formulation. In fact, with regards to understanding the derivations, Heath’s
exposition is easier to follow, as compared to Dijksterhuis’ use of cryptic
notations (for instance, Dijksterhuis denotes the square of a value x as T (x)
instead of as x2 , which makes the equations harder to follow). Furthermore,
while Dijksterhuis’ treatment ignores some of the steps in the derivation,
Heath gives a fuller account of each step.

At the end of the introduction to the treatise On Conoids and Spheroids


(pp. 119 in Dijksterhuis’ translation of De Conoidibus et Sphaeroidibus),
Archimedes mentions the following lemma without proof:

If any number of magnitudes be given, which exceed one another by an


equal amount equal to the least, and also other magnitudes, equal in
number to the former, but each equal in quantity to the greatest, all the
magnitudes each of which is equal to the greatest will be less than the
duplicate of all those exceeding one another by an equal amount and
more than the duplicate of these minus the greatest.

This is another way of saying Σni=1 i = n(n + 1)/2, which we discussed


previously using Gauss’ derivation2. Then, in proposition 10 of the treatise On
Spirals (Dijksterhuis’ translation of De lineis Spiralibus), Archimedes states and
proves the sum of squares:

If a series of any number of lines be given, which exceed one another by


an equal amount; and the difference be equal to the least, and if other
lines be given equal in number to these and in quantity to the greatest,
the squares on the lines equal to the greatest, plus the square on the
greatest and the rectangle contained by the least and the sum of all
those exceeding one another by an equal amount will be the triplicate of
all the squares on the lines exceeding one another by an equal amount.

In modern formulation, as given on pp. 107–109 of Heath’s translation, the


derivation proceeds from:

If A1 , A2 , A3 , … , An be n lines forming an ascending arithmetical


progression in which the common difference is equal to the least term
A1 , then
(n + 1)A2n + A1 (A1 + A2 + A3 + ⋯ + An )
= 3(A21 + A22 + A23 + ⋯ + A2n )

to produce,

(n + 1)(na)2 + a(a + 2a + ⋯ + na) = 3[a2 + (2a)2 + ⋯ + (na)2 ]

where a is the common difference. From this, Archimedes finally derives the
result Σni=1 i2 = n(n + 1)(2n + 1)/6 .

1. I have discussed the cited works in the order that I discovered them,
beginning with the bulletin message (http://mathforum.org/library/drmath
/view/56920.html) I read on 9 June 2010 Wednesday, 23:49↩
2. Did Gauss know about Archimedes’ work? In Dijksterhuis’ translation, a
diagrammatic argument is presented using two sequences of bars, each
representing a sequence of natural number, one ascending and the other
descending, which when combined produce a rectangle. It is easy to see
that this is a visual representation of Gauss’ derivation.↩
0 Comments Yaikhom 
1 Login

Sort by Best
 Recommend 6 ⤤ Share

Start the discussion…


LOG IN WITH
OR SIGN UP WITH DISQUS ?

Name

Be the first to comment.

ALSO ON YAIKHOM

Lean web services using C, Something about matrices


FastCGI and MySQL 2 comments • 5 years ago
10 comments • 4 years ago gyaikhom — For square matrices,
gyaikhom — Thank you. I have yes. But for non-square matrices,
updated the article. you'll have to resize the existing

Image segmentation using the Non-recursive tree traversal


DBSCAN clustering algorithm without a Stack
2 comments • 3 years ago 1 comment • 4 years ago
Fangliang Bai — Thank you for Michał — The size of stack need
posting your work. I love it. Will to be equal to the height of tree,
keep focusing your future post not to the number of all nodes.

✉ Subscribe d Add Disqus to your siteAdd DisqusAdd


🔒 Disqus' Privacy PolicyPrivacy PolicyPrivacy

You might also like