Spatiotemporal Data Analysis
Ebook · 520 pages · 4 hours


About this ebook

A severe thunderstorm morphs into a tornado that cuts a swath of destruction through Oklahoma. How do we study the storm's mutation into a deadly twister? Avian flu cases are reported in China. How do we characterize the spread of the flu, potentially preventing an epidemic? The way to answer important questions like these is to analyze the spatial and temporal characteristics--origin, rates, and frequencies--of these phenomena. This comprehensive text introduces advanced undergraduate students, graduate students, and researchers to the statistical and algebraic methods used to analyze spatiotemporal data in a range of fields, including climate science, geophysics, ecology, astrophysics, and medicine.


Gidon Eshel begins with a concise yet detailed primer on linear algebra, providing readers with the mathematical foundations needed for data analysis. He then fully explains the theory and methods for analyzing spatiotemporal data, guiding readers from the basics to the most advanced applications. This self-contained, practical guide to the analysis of multidimensional data sets features a wealth of real-world examples as well as sample homework exercises and suggested exams.

Language: English
Release date: Dec 5, 2011
ISBN: 9781400840632
Author: Gidon Eshel

Gidon Eshel is Bard Center Fellow at Bard College.

    Book preview

    Spatiotemporal Data Analysis - Gidon Eshel


    PART 1


    Foundations

    ONE


    Introduction and Motivation

    BEFORE YOU START working your way through this book, you may ask yourself—Why analyze data? This is an important, basic question, and it has several compelling answers.

    The simplest need for data analysis arises most naturally in disciplines addressing phenomena that are, in all likelihood, inherently nondeterministic (e.g., feelings and psychology or stock market behavior). Since such fields of knowledge are not governed by known fundamental equations, the only way to generalize disparate observations into expanded knowledge is to analyze those observations. In addition, in such fields predictions are entirely dependent on empirical models of the types discussed in chapter 9 that contain parameters not fundamentally constrained by theory. Finding these models’ numerical values most suitable for a particular application is another important role of data analysis.

    A more general rationale for analyzing data stems from the complementary relationship of empirical and theoretical science and dominates contexts and disciplines in which the studied phenomena have, at least in principle, fully knowable and usable fundamental governing dynamics (see chapter 7). In these contexts, best exemplified by physics, theory and observations both vie for the helm. Indeed, throughout the history of physics, theoretical predictions of yet unobserved phenomena and empirical observations of yet theoretically unexplained ones have alternately fixed physics’ ropes.¹ When theory leads, its predictions must be tested against experimental or observational data. When empiricism is at the helm, coherent, reproducible knowledge is systematically and carefully gleaned from noisy, messy observations. At the core of both, of course, is data analysis.

    Empiricism’s biggest triumph, affording it (ever so fleetingly) the leadership role, arises when novel data analysis-based knowledge—fully acquired and processed—proves at odds with relevant existing theories (i.e., equations previously thought to govern the studied phenomenon fail to explain and reproduce the new observations). In such cases, relatively rare but game changing, the need for a new theory becomes apparent.² When a new theory emerges, it either generalizes existing ones (rendering previously reigning equations a limiting special case, as in, e.g., Newtonian vs. relativistic gravity), or introduces an entirely new set of equations. In either case, at the root of the progress thus achieved is data analysis.

    Once a new theory matures and its equation set becomes complete and closed, one of its uses is model-mediated predictions. In this application of theory, another rationale for data analysis sometimes emerges. It involves phenomena (e.g., fluid turbulence) for which governing equations may exist in principle, but their applications to most realistic situations is impossibly complex and high-dimensional. Such phenomena can thus be reasonably characterized as fundamentally deterministic yet practically stochastic. As such, practical research and modeling of such phenomena fall into the first category above, that addressing inherently nondeterministic phenomena, in which better mechanistic understanding requires better data and better data analysis.

    Data analysis is thus essential for scientific progress. But is the level of algebraic rigor characteristic of some of this book’s chapters necessary? After all, in some cases we can use some off-the-shelf spreadsheet-type black box for some rudimentary data analysis without any algebraic foundation. How you answer this question is a subjective matter. My view is that while in a few cases some progress can be made without substantial understanding of the underlying algebraic machinery and assumptions, such analyses are inherently dead ends in that they can be neither generalized nor extended beyond the very narrow, specific question they address. To seriously contribute to any of the progress routes described above, in the modular, expandable manner required for your work to potentially serve as the foundation of subsequent analyses, there is no alternative to thorough, deep knowledge of the underlying linear algebra.

    ¹ As beautifully described in Feuer, L. S. (1989) Einstein and the Generations of Science, 2nd ed., Transaction, 390 pp., ISBN-10: 0878558993, ISBN-13: 978-0878558995, and also, with different emphasis, in Kragh, H. (2002) Quantum Generations: A History of Physics in the Twentieth Century, Princeton University Press, Princeton, NJ, 512 pp., ISBN-13: 978-0-691-09552-3.

    ² Possibly the most prominent examples of this route (see Feuer’s book) are the early development of relativity partly in an effort to explain the Michelson-Morley experiment, and the emergence of quantum mechanics for explaining blackbody radiation observations.

    TWO


    Notation and Basic Operations

    WHILE ALGEBRAIC BASICS can be found in countless texts, I really want to make this book as self-contained as reasonably possible. Consequently, in this chapter I introduce some of the basic players of the algebraic drama about to unfold, and the uniform notation I have done my best to adhere to in this book. While chapter 3 is a more formal introduction to linear algebra, in this introductory chapter I also present some of the most basic elements, and permitted manipulations and operations, of linear algebra.

    1.  Scalar variables: Scalars are given in lowercase, slanted, Roman or Greek letters, as in a, b, x, α, β, θ.

    2.  Stochastic processes and variables: A stochastic variable is denoted by an italicized uppercase X. A particular value, or realization, of the process X is denoted by x.

    3.  Matrix variables: Matrices are the most fundamental building block of linear algebra. They arise in many, highly diverse situations, which we will get to later. A matrix is a rectangular array of numbers, e.g.,

    A matrix is said to be M×N (M by N) when it comprises M rows and N columns. A vector is a special case of matrix for which either M or N equals 1. By convention, unless otherwise stated, we will treat vectors as column vectors.

    4.  Fields: Fields are sets of elements satisfying the addition and multiplication field axioms (associativity, commutativity, distributivity, identity, and inverses), which can be found in most advanced calculus or abstract algebra texts. In this book, the single most important field is the real line, the set of real numbers, denoted by ℝ. Higher-dimensional spaces over ℝ are denoted by ℝ^N.

    5.  Vector variables: Vectors are denoted by lowercase, boldfaced, Roman letters, as in a, b, x. When there is risk of ambiguity, and only then, I adhere to normal physics notation, and adorn the vector with an overhead arrow, as in a⃗. Unless specifically stated otherwise, all vectors are assumed to be column vectors,

    where a is said to be an M-vector (a vector with M elements); ≡ means "equivalent to"; a_i is a's ith element (1 ≤ i ≤ M); ∈ means "an element of," so that the object to its left is an element of the object to its right (typically a set); and ℝ^M is the set (denoted by {·}) of real M-vectors

    ℝ^M is the set of all M-vectors a of which element i, a_i, is real for all i (this is the meaning of ∀i). Sometimes, within the text, I use a = (a_1 a_2 … a_M)^T (see below).
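
    The column-vector convention maps directly onto numerical arrays. Below is a minimal sketch in NumPy (the book itself is purely algebraic; the library, variable names, and numbers are mine, for illustration only) of an M-vector stored as an M × 1 column, with its shape and one of its elements inspected.

        import numpy as np

        # An M-vector a in R^M, stored explicitly as an M x 1 column.
        a = np.array([[3.0], [1.0], [4.0], [1.0]])   # here M = 4
        M = a.shape[0]

        print(a.shape)    # (4, 1): M rows, one column
        print(a[2, 0])    # a's third element, a_3 = 4.0 (Python indexes from 0)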

    6.  Vector transpose: For

    where a^T is pronounced "a transpose."

    7.  Vector addition: If two vectors share the same dimension N (i.e., a ∈ ℝ^N and b ∈ ℝ^N), then their sum or difference c is defined by c = a ± b, with elements c_i = a_i ± b_i.

    8.  Linear independence: Two vectors a and b are said to be linearly dependent if there exists a scalar α such that a = αb. For this to hold, a and b must be parallel. If no such α exists, a and b are linearly independent.

    In higher dimensions, the situation is naturally a bit murkier. The elements of a set of K ℝ^N vectors, {v_i}, i = 1, 2, …, K, are linearly dependent if there exists a set of scalars {α_i}, not all zero, which jointly satisfy α_1 v_1 + α_2 v_2 + ⋯ + α_K v_K = 0,

    where the right-hand side is the ℝ^N zero vector. If the above is only satisfied for α_i = 0 ∀i (i.e., if the above only holds if all αs vanish), the elements of the set {v_i} are mutually linearly independent.
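
    Numerically, a convenient way to test this condition is to stack the vectors as columns of a matrix and compute its rank: the vectors are mutually linearly independent exactly when the rank equals K. The sketch below uses NumPy and arbitrary example vectors (neither the library nor the numbers are from the book).

        import numpy as np

        v1 = np.array([1.0, 0.0, 2.0])
        v2 = np.array([0.0, 1.0, 1.0])
        v3 = v1 + 2.0 * v2                         # deliberately a linear combination

        V = np.column_stack([v1, v2, v3])          # N x K matrix holding the K vectors
        print(np.linalg.matrix_rank(V))            # 2 < K = 3: the set is linearly dependent
        print(np.linalg.matrix_rank(V[:, :2]))     # 2 = K: v1 and v2 alone are independent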

    9.  Inner product of two vectors: For all practical data analysis purposes, if two vectors share the same dimension N as before, their dot, or inner, product exists and is the scalar

    a · b = a^T b = Σ_{i=1}^{N} a_i b_i

    (where Σ_{i=1}^{N} is often abbreviated as Σ_i).
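
    Numerically, the inner product is a single sum of elementwise products, as in this small NumPy sketch (illustrative values only):

        import numpy as np

        a = np.array([1.0, 2.0, 3.0])
        b = np.array([4.0, -1.0, 0.5])

        print(a @ b)             # a^T b = 1*4 + 2*(-1) + 3*0.5 = 3.5
        print(np.sum(a * b))     # the same sum, written out elementwise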

    10. Projection: The inner product gives rise to the notion of the projection of one vector on another, explained in fig. 2.1.

    11. Orthogonality: Two vectors u and v are mutually orthogonal, denoted u ⊥ v, if u^T v = v^T u = 0. If, in addition to u^T v = v^T u = 0, u^T u = v^T v = 1, u and v are mutually orthonormal.
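
    Projection (item 10) and orthogonality are both one-liners numerically. The sketch below (NumPy; the two vectors are those of fig. 2.1) computes the projection p of a on b from the formula given in the caption of fig. 2.1, p = [(a^T b)/(b^T b)]b, and verifies that the residual r = a − p is orthogonal to b.

        import numpy as np

        a = np.array([22.0, 29.0])
        b = np.array([22.0, 3.0])

        p = (a @ b) / (b @ b) * b        # projection of a onto the direction of b
        r = a - p                        # residual: a's part perpendicular to b

        print(p, r)
        print(np.isclose(r @ b, 0.0))    # True: r is orthogonal to b (up to roundoff)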

    12. The norm of a vector: For any p ∈ ℝ, the p-norm of the vector a ∈ ℝ^N is

    ‖a‖_p = (Σ_{i=1}^{N} |a_i|^p)^{1/p},

    where the real scalar |a_i| is the absolute value of a's ith element.

    Most often, the definition above is narrowed by setting p ∈ ℕ_1, where ℕ_1 is the set of positive natural numbers, ℕ_1 = {1, 2, 3, … }.

    A particular norm frequently used in data analysis is the L² (also denoted L2), often used interchangeably with the Euclidean norm,

    ‖a‖_2 = (Σ_{i=1}^{N} a_i^2)^{1/2} = (a^T a)^{1/2},

    where above I use the common convention of omitting the p when p = 2, i.e., using ‖a‖ as a shorthand for ‖a‖_2. The term Euclidean norm refers to the fact that in a Euclidean space, a vector's L²-norm is its length. For example, consider r = ( 1 2 )^T shown in fig. 2.2 in its natural habitat, ℝ², the geometrical two-dimensional plane intuitively familiar from daily life. The vector r connects the origin, (0, 0), and the point, (1, 2); how long is it?! Denoting that length by r and invoking the Pythagorean theorem (appropriate here because x ⊥ y in Euclidean spaces),

    r = √(1² + 2²) = √5,

    which is exactly ‖r‖_2 = (r^T r)^{1/2},

    Figure 2.1. Projection of a = ( 22 29 )^T (thick solid black line) onto b = ( 22 3 )^T (thick solid gray line), shown by the thin black line parallel to b, p ≡ [(a^T b)/(b^T b)]b = (a^T b̂)b̂. The projection is best visualized as the shadow cast by a on the b direction in the presence of a uniform lighting source shining from upper left to lower right along the thin gray lines, i.e., perpendicular to b. The dashed line is the residual of a, r = a − p, which is normal to p, (a − p)^T p = 0. Thus, p = a_∥ (a's part in the direction of b) and r = a_⊥ (a's part perpendicular to b), so p and r form an orthogonal split of a.

    Figure 2.2. A schematic representation of the Euclidean norm as the length of a vector in ℝ².

    demonstrating the "length of a vector" interpretation of the L²-norm.
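
    The r = ( 1 2 )^T example is easy to verify numerically; the following NumPy sketch (illustrative only) compares the L²-norm with the explicit Pythagorean computation.

        import numpy as np

        r = np.array([1.0, 2.0])

        length = np.sqrt(1.0**2 + 2.0**2)    # Pythagorean theorem: sqrt(5)
        norm2 = np.linalg.norm(r)            # NumPy's default norm is the L2 (Euclidean) norm
        norm2_alt = np.sqrt(r @ r)           # (r^T r)^(1/2), the same quantity

        print(length, norm2, norm2_alt)      # all approximately 2.2360679...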

    13. Unit vectors: Vectors of unit length,

    â ≡ a/‖a‖,

    where a, â ≠ 0 ∈ ℝ^N, are called unit vectors and are adorned with an overhat.

    Note that

    ‖â‖ = ‖a‖/‖a‖ = 1

    by construction.
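
    Normalizing a vector to unit length follows directly from the definition, as in this minimal NumPy sketch (the vector is an arbitrary nonzero example):

        import numpy as np

        a = np.array([3.0, 0.0, 4.0])

        a_hat = a / np.linalg.norm(a)    # a_hat = a / ||a||, defined only for a != 0

        print(a_hat)                     # [0.6, 0.0, 0.8]
        print(np.linalg.norm(a_hat))     # 1.0 by construction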

    14. Matrix variables: Matrices are denoted by uppercase, boldfaced, Roman letters, as in A, B, M. When there is any risk of ambiguity, and only then, I adorn matrix variables with two underlines, as in

    Unless otherwise explicitly stated due to potential ambiguity, matrices are considered to be M × N (to have dimensions M by N), i.e., to have M rows and N columns,

    where a_ij is A's real scalar element in row i and column j.

    We sometimes need a column-wise representation of a matrix, for which the notation is

    where the ith column is a_i, and 1 ≤ i ≤ N.

    15. Matrix addition: For C = A ± B to be defined, A and B must have the same dimensions. Then, C inherits these dimensions, and its elements are c_ij = a_ij ± b_ij.

    16. Transpose of a matrix: The transpose of

    A = (a_1 a_2 … a_N) ∈ ℝ^{M×N},        (2.19)

    where a_i is A's ith column, is the N × M matrix A^T whose ith row is a_i^T, so that A's element ij is equal to A^T's element ji.

    17. Some special matrices:

    •  Square diagonal (M = N):

    •  Rectangular diagonal, M > N:

    i.e.,

    •  Rectangular diagonal, M < N:

    i.e.,

    •  Square symmetric, M = N:

    18. Matrix product: AB is possible only if A and B share their second and first dimensions, respectively. That is, for AB to exist, A ∈ ℝ^{M×N}, B ∈ ℝ^{N×K}, where M and K are positive integers, must hold. When the matrix multiplication is permitted, its ij element is

    (AB)_ij = Σ a_in b_nj,

    where AB ∈ ℝ^{M×K}, and all sums run over [1, N], i.e., Σ is shorthand for Σ_{n=1}^{N}.

    If we denote A's ith row by a_i^T and B's jth column by b_j and take advantage of the summation implied by the inner product definition, AB can also be written more succinctly as (AB)_ij = a_i^T b_j.

    To check whether a given matrix product is possible, multiply the dimensions: if AB is possible, its dimensions will be (M × N)(N × K) ~ M × K, where ~ means loosely "goes dimensionally as," and crossing out the matching inner dimension (N in this case) indicates that it is annihilated by the permitted multiplication (or, put differently, N is the number of terms summed when evaluating the inner product of A's ith row and B's jth column to obtain AB's element ij). When there is no cancellation, as in CD ~ (M × N)(J × K), J ≠ N, the operation is not permitted and CD does not exist.

    In general, matrix products do not commute; AB ≠ BA. One or both of these may not even be permitted because of failure to meet the requirement for a common inner dimension. For this reason, we must distinguish post- from premultiplication: in AB, A premultiplies B and B postmultiplies A.
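
    Both the dimension rule and the lack of commutativity are easy to see numerically. In the NumPy sketch below (shapes chosen arbitrarily, for illustration), A is 2 × 3 and B is 3 × 4, so AB is 2 × 4 while BA is not even defined.

        import numpy as np

        rng = np.random.default_rng(0)
        A = rng.normal(size=(2, 3))    # M x N with M = 2, N = 3
        B = rng.normal(size=(3, 4))    # N x K with K = 4

        C = A @ B                      # allowed: the inner dimensions (3 and 3) match
        print(C.shape)                 # (2, 4), i.e., M x K

        try:
            B @ A                      # (3 x 4)(2 x 3): inner dimensions differ
        except ValueError as err:
            print("BA is not defined:", err)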

    19. Outer product: A vector pair {a ∈ ℝ^M, b ∈ ℝ^N} can generate

    C = ab^T ∈ ℝ^{M×N},

    where ab^T is the outer product of a and b. (A more formal and general notation is C = a ⊗ b. However, in the context of most practical data analyses, a ⊗ b and ab^T are interchangeable.) Expanded, the outer product has elements c_ij = a_i b_j, a degenerate form of eq. 2.28. (The above C matrix can only be rank 1 because it is the outer product of a single vector pair. More on rank later.)
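
    A numerical sketch of the outer product (NumPy, arbitrary vectors, illustration only) shows both the M × N shape and the rank-1 property:

        import numpy as np

        a = np.array([1.0, 2.0, 3.0])       # a in R^M, M = 3
        b = np.array([4.0, 5.0])            # b in R^N, N = 2

        C = np.outer(a, b)                  # C = a b^T, a 3 x 2 matrix with c_ij = a_i * b_j

        print(C)
        print(np.linalg.matrix_rank(C))     # 1: a single vector pair yields at most rank 1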

    20. Matrix outer product: By extension of the above, with a_i ∈ ℝ^M and b_i ∈ ℝ^N denoting the ith columns of A ∈ ℝ^{M×J} and B ∈ ℝ^{N×J},

    C = AB^T ∈ ℝ^{M×N},

    where the summation is carried out along the annihilated inner dimension, i.e., c_ij = Σ_{k=1}^{J} a_ik b_jk. Because the same summation is applied to each term, it can be applied to the whole matrix rather than to individual elements. That is, C can also be expressed as the J element series of M × N rank 1 matrices

    It may not be obvious at first, but the jth element of this series is a_j b_j^T. To show this, recall that

    and

    so that

    the jth element of the series in eq. 2.33. That is,

    Because some terms in this sum can be mutually redundant, C’s rank need not be full.
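
    The equivalence of the column-by-column series and the full product AB^T can be checked directly. A short NumPy sketch with arbitrary small random matrices (purely illustrative):

        import numpy as np

        rng = np.random.default_rng(1)
        M, N, J = 4, 3, 5
        A = rng.normal(size=(M, J))    # columns a_1 ... a_J, each in R^M
        B = rng.normal(size=(N, J))    # columns b_1 ... b_J, each in R^N

        C = A @ B.T                    # the M x N "matrix outer product"

        # Rebuild C as the J-term series of rank-1 matrices a_j b_j^T.
        C_series = sum(np.outer(A[:, j], B[:, j]) for j in range(J))

        print(np.allclose(C, C_series))     # True
        print(np.linalg.matrix_rank(C))     # at most min(M, N, J); generically 3 here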

    THREE


    Matrix Properties, Fundamental Spaces, Orthogonality

    3.1 VECTOR SPACES

    3.1.1 Introduction

    FOR OUR PURPOSES, it is sufficient to think of a vector space as the set of all vectors of a certain type. While the vectors need not be actual vectors (they can also be functions, matrices, etc.), in this book vectors are literally column vectors of real number elements, which means we consider vector spaces over ℝ.

    The lowest dimensional vector space is ℝ⁰, comprising a single point, 0; not too interesting. In ℝ¹, the real line, one and only one kind of inhabitant is found: 1-vectors (scalars) whose single element is any one of the real numbers from −∞ to ∞. The numerical value of v ∈ ℝ¹ ("v which is an element of R-one") is the distance along the real line from the origin (0, not boldfaced because it is a scalar) to v. Note that the rigid distinction between scalars and vectors, while traditional in physics, is not really warranted because ℝ¹ contains vectors, just like any other ℝ^N, but they all point in a single direction, the one stretching from −∞ to ∞.

    Next up is the familiar geometrical plane, or ℝ² (fig. 3.1), home to all 2-vectors. Each 2-vector ( x y )^T connects the origin (0, 0) and the point (x, y) on the plane. Thus, the two elements are the projections of the vector on the two coordinates (the dashed projections in fig. 3.1). Likewise, ℝ³, the three-dimensional Euclidean space in which our everyday life unfolds, is home to 3-vectors v = ( v1 v2 v3 )^T stretched in three-dimensional space between the origin (0, 0, 0) and (v1, v2, v3). While ℝ^N with N ≥ 4 may be harder to visualize, such vector spaces are direct generalizations of the more intuitive ℝ² or ℝ³.

    Vector spaces follow a few rules. Multiplication by a scalar and vector addition are defined, yielding vectors in the same space: with α ∈ ℝ, u ∈ ℝ^N and v ∈ ℝ^N, αu ∈ ℝ^N and (u + v) ∈ ℝ^N are defined. Addition is commutative (u + v = v + u) and associative (u + (v + w) = w + (u + v) = v + (u + w) or any other permutation of u, v, and w). There exists a zero-vector 0 satisfying v + 0 = v, and vectors and their negative counterparts (additive inverses; unlike scalars, vectors do not have multiplicative inverses, so 1/u is meaningless) satisfy v + (−v) = 0. Multiplication by a scalar is distributive, α(u + v) = αu + αv and (α + β)u = αu + βu, and satisfies α(βu) = (αβ)u = αβu. Additional vector space rules and axioms, more general but less germane to data analysis, can be found in most linear algebra texts.

    Figure 3.1. Schematic of ℝ². The vector (thick line) is an arbitrarily chosen u = ( 4 5 )^T ∈ ℝ². The vector components of u in the direction of x̂ and ŷ, with (scalar) magnitudes given by u^T x̂ and u^T ŷ, are shown by the dashed horizontal and vertical lines, respectively.

    3.1.2 Normed Inner-Product Vector Spaces

    Throughout this book we will treat ℝ^N as a normed inner-product vector space, i.e., one in which both the norm and the inner product, introduced in chapter 2, are well defined.

    3.1.3 Vector Space Spanning

    An N-dimensional vector space is minimally spanned by a particular (nonunique) choice of N linearly independent ℝ^N vectors in terms of which each ℝ^N vector can be uniquely expressed. Once the choice of these N vectors is made, the vectors are collectively referred to as a basis for ℝ^N, and each one of them is a basis vector. The term spanning refers to the property that because of their linear independence, the basis vectors can express—or span—any arbitrary ℝ^N vector. Pictorially, spanning is explained in fig. 3.2. Imagine a (semi-transparent gray) curtain suspended from a telescopic rod attached to a wall (left thick vertical black line). When the rod is retracted (left panel), the curtain collapses to a vertical line, and is thus a one-dimensional object. When the rod is extended (right panel), it spans the curtain, which therefore becomes two dimensional. In the former (left panel) case, gravity is the spanning force, and—since it operates in the up–down direction—the curtain's only relevant dimension is its height, the length along the direction of gravity. In the extended case (right panel), gravity is joined by the rod, which extends, or spans, the curtain sideways. Now the curtain has two relevant dimensions, along gravity and along the rod. These two thus form a spanning set, a basis, for the two-dimensional curtain.

    Figure 3.2. Schematic explanation of vector space spanning by the basis set, discussed in the text.

    Let us consider some examples. For spanning ℝ³, the Cartesian basis set

    î = ( 1 0 0 )^T,    ĵ = ( 0 1 0 )^T,    k̂ = ( 0 0 1 )^T

    (sometimes denoted ) is often chosen. This set is suitable for spanning ℝ³ because any ℝ³ vector v = ( v1 v2 v3 )^T can be expressed as the linear combination v = v1 î + v2 ĵ + v3 k̂.

    Note, again, that this is not a unique choice for spanning ℝ³; there are infinitely many such choices. The only constraint on the choice, again, is that to span ℝ³, the 3 vectors must be linearly independent, that is, that no nontrivial {α, β, γ} satisfying αî + βĵ + γk̂ = 0 can be found.
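
    Both properties, that the Cartesian set spans ℝ³ and that only the trivial {α, β, γ} combines it into the zero vector, are quick to check numerically. A minimal NumPy sketch (the test vector v is an arbitrary choice of mine, not from the book):

        import numpy as np

        E = np.eye(3)                      # columns are i_hat, j_hat, k_hat
        v = np.array([2.0, -1.0, 0.5])     # an arbitrary R^3 vector

        coeffs = np.linalg.solve(E, v)     # the unique (alpha, beta, gamma) with E @ coeffs = v
        print(coeffs)                      # [ 2.  -1.   0.5]: just v's own elements

        print(np.linalg.matrix_rank(E))    # 3: linearly independent, so only the trivial
                                           # combination produces the zero vector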

    The requirement for mutual linear independence of the basis vectors follows from the fact that a 3-vector has 3 independent pieces of information, v1, v2, and v3. Given these 3 degrees of freedom (three independent choices in making up v; much more on that later in the book), we must have 3 corresponding basis vectors with which to work. If one of the basis vectors is a linear combination of other ones, e.g., if ĵ = αî, say, then î and ĵ no longer represent two directions in ℝ³, but just one. To show how this happens, consider the choice

    ( 1 0 0 )^T,    ( 0 1 0 )^T,    ( 2 3 0 )^T,

    which cannot represent ( 0 0 v3 )^T. Thus, this choice of a basis doesn't span ℝ³ (while it does span ℝ² ⊂ ℝ³ just as well as î and ĵ alone, for 3 vectors to span ℝ², a two-dimensional subspace of ℝ³, is not very impressive). To add a third basis vector that will complete the spanning of ℝ³, we need a vector not contained in any z = constant plane. Fully contained within the z = 0 plane already successfully spanned by the previous two basis vectors, ( 2 3 0 )^T doesn't help.

    Note that the above failure to span ℝ³ is not because none of our basis vectors has a nonzero third element; try finding {α, β, γ} satisfying

    ( v1 v2 v3 )^T = α( 1 1 1 )^T + β( 1 0 −1 )^T + γ( 2 1 0 )^T

    (i.e., consider the ℝ³ spanning potential of the above three ℝ³ vectors). The second and third rows give

    v2 = α + γ ⇒ γ = v2 − α    and    v3 = α − β ⇒ β = α − v3,

    so the first row becomes

    v1 = α + β + 2γ = α + α – v3 + 2v2 – 2α = 2v2 – v3.

    Thus, the considered set can span the subset of ℝ³ vectors of the general form ( 2v2 − v3 v2 v3 )^T, but not arbitrary ones (for which v1 ≠ 2v2 − v3). This is because

    ( 2 1 0 )^T = ( 1 1 1 )^T + ( 1 0 −1 )^T,

    i.e., the third spanning vector in this deficient spanning set, the sum of the earlier two, fails to add a third dimension required for fully spanning ℝ³.
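
    A rank computation makes the deficiency immediate. The sketch below (NumPy) uses the three spanning vectors as reconstructed from the row equations above; a rank of 2 confirms that the third vector, being the sum of the first two, adds no new direction, and an augmented-matrix rank test shows which vectors are representable.

        import numpy as np

        u1 = np.array([1.0, 1.0, 1.0])
        u2 = np.array([1.0, 0.0, -1.0])
        u3 = np.array([2.0, 1.0, 0.0])           # = u1 + u2, hence redundant

        U = np.column_stack([u1, u2, u3])
        print(np.allclose(u3, u1 + u2))          # True
        print(np.linalg.matrix_rank(U))          # 2, not 3: the set spans only a plane in R^3

        v_in = np.array([2*1.0 - 3.0, 1.0, 3.0])    # of the form (2*v2 - v3, v2, v3)
        v_out = np.array([0.0, 1.0, 3.0])           # violates v1 = 2*v2 - v3
        print(np.linalg.matrix_rank(np.column_stack([U, v_in])))     # 2: representable
        print(np.linalg.matrix_rank(np.column_stack([U, v_out])))    # 3: not representable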

    To better understand the need for linear independence of basis vectors, it is useful to visualize the geometry of the problem. Consider

    which fail to span ℝ³, because k is linearly dependent on i and j. What does this failure look like? While this more interesting and general situation is not obvious to visualize—the redundancy occurs in a plane parallel to neither of ( 1 0 0 )^T, ( 0 1 0 )^T, or ( 0 0 1 )^T but inclined with respect to all of them—visualization may be facilitated by fig. 3.3. (We will learn later how to transform the coordinates so that the redundant plane becomes a fixed value of one coordinate, which we can then eliminate from the problem, thus reducing the apparent dimensionality, 3, to the actual dimensionality, 2.)

    Now let's go back to the easier to visualize vectors î = ( 1 0 0 )^T, ĵ = ( 0 1 0 )^T. We have realized above that to assist in spanning ℝ³, the additional basis vector must not be fully contained within any z = constant plane. To meet this criterion,

    k − (î^T k)î − (ĵ^T k)ĵ ≠ 0,

    i.e., k must have a nonzero remainder after subtracting its projections on î and ĵ. Because î^T k = k1 and ĵ^T k = k2, this requirement reduces to

    ( 0 0 k3 )^T ≠ 0,

    which can vanish only when k3 = 0. Thus, any k with nonzero k3 will complement ( 1 0 0 )^T and ( 0 1 0 )^T in spanning ℝ³.

    However, we are still left with a choice of exactly which k among all those satisfying k3 ≠ 0 we choose; we can equally well add ( 1 1 1 )^T, ( 0 0 1 )^T, ( 1 1 −4 )^T, etc. Given this indeterminacy, the choice is ours; any one of these vectors will do just fine. It is often useful, but not algebraically essential, to choose mutually orthogonal basis vectors so that the information contained in one is entirely absent from the others. With ( 1 0 0 )^T and ( 0 1 0 )^T already chosen, the vector orthogonal to both must satisfy
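
    The preview ends before the final condition is displayed, but the orthogonality requirement of item 11 in chapter 2 is straightforward to check numerically. This NumPy sketch (illustrative only) contrasts two of the admissible third vectors listed above: both complete a basis for ℝ³, but only ( 0 0 1 )^T is also orthogonal to î and ĵ.

        import numpy as np

        i_hat = np.array([1.0, 0.0, 0.0])
        j_hat = np.array([0.0, 1.0, 0.0])

        for k in (np.array([0.0, 0.0, 1.0]), np.array([1.0, 1.0, 1.0])):
            basis = np.column_stack([i_hat, j_hat, k])
            print(k,
                  "rank:", np.linalg.matrix_rank(basis),              # 3: spans R^3 either way
                  "orthogonal:", i_hat @ k == 0.0 and j_hat @ k == 0.0)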
