
Testing recursive path models with correlated

errors using d-separation

Bill Shipley

Département de biologie

Université de Sherbrooke

Sherbrooke (Qc) J1K 2R1

CANADA

bshipley@courrier.usherb.ca

(819) 821-8000 ext. 2079

(819) 821-8049 (FAX)

6 June 2002

Structural Equation Modeling (2003) 10(2):214-221.

Running Head: d-sep test for path models with correlated errors

ABSTRACT

This paper shows how to extend the inferential test of Shipley (2000b),

which applies to recursive path models without correlated errors (a DAG

model), to a class of recursive path models that include correlated errors (a semi-

Markov model). The path model is first converted to a partial ancestral graph

(PAG) and then, for PAGs that do not require latent variables, an inducing path

DAG is obtained that is equivalent in its conditional independence relationships

to the original path model. The null probabilities of the k tests of independence

that are implied by this DAG are combined using Fisher’s test statistic

$C = -2\sum_{i=1}^{k} \ln(p_i)$, which is distributed as a chi-square variate with 2k degrees of freedom.

INTRODUCTION

Shipley (2000a,b) introduced a new inferential test for recursive path

models without correlated errors. This test has a number of advantages over the

classical tests based on maximum likelihood estimation of a model covariance

matrix. First, the new test is exact rather than asymptotic and therefore permits

tests using small data sets. Second, being based on tests of (conditional)

independence, many different types of variables, functional relationships and

probability density functions can be accommodated. Third, the model test can be

decomposed into individual tests of elementary predictions of the underlying

causal hypothesis; this “local” property allows the researcher to identify those

parts of a poorly-fitting model that are contributing to the lack of fit. Here, I

describe the conditions under which the test can be extended to path models

with correlated errors and how this extension is performed.

The test in Shipley (2000b) is based on two notions derived from the

theory of directed acyclic graphs (DAGs): d-separation and basis sets of d-

separation relationships. D-separation is a manipulation of a DAG that separates

the directed paths between any two variables (vi, vj) in the DAG given a set Q of

other variables. In this paper I will use the notation (vi| Q| vj) to mean “variable vi

is d-separated from variable vj given the set of variables Q” and say that the set

Q “blocks” the directed paths linking vi and vj in the DAG. It can be proven that if

any two variables in a DAG are d-separated then, in any data generated by the

causal process represented by the DAG, the variables will be probabilistically

independent upon conditioning on the variables in the set Q (Pearl 1988;

Geiger, Verma and Pearl 1990; Geiger, Paz and Pearl 1991; Geiger and Pearl

1993; Pearl 2000). This does not require assumptions of linearity of the

functional relationships nor of any particular multivariate distribution of the

variables. Two variables (vi, vj) in a DAG (a Markov model) or in a semi-Markov

graph, for instance a path diagram with correlated errors, are d-separated given

a set Q of other variables in the graph if and only if there is no undirected path U

between vi and vj such that (i) every non-colliding variable along U is not a

member of Q and (ii) every colliding variable along U is either a member of Q or

else has a causal descendent that is a member of Q. A variable X is said to

“collide” along an undirected path U if it has arrowheads pointing into it from both

directions, thus: W→X←Y. An extended discussion of d-separation is found in

Shipley (2000a).
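
The definition above can be checked mechanically. The following Python sketch is an illustration only and is not code from Shipley (2000a,b). It tests d-separation in a DAG with the moralization criterion (moralize the ancestral graph of the variables involved, then look for a path that avoids Q), which is known to be equivalent to the path-based definition for DAGs. The function name and the representation of the graph as a dictionary of parent sets are assumptions made for this example.

```python
from itertools import combinations

def d_separated(parents, x, y, given=()):
    """True if x and y are d-separated given `given` in the DAG `parents`.

    `parents` maps every variable to the set of its causal parents
    (a hypothetical representation chosen for this sketch).
    """
    given = set(given)
    # Smallest ancestral set containing x, y and the conditioning set.
    keep, stack = set(), [x, y, *given]
    while stack:
        v = stack.pop()
        if v not in keep:
            keep.add(v)
            stack.extend(parents[v])
    # Moral graph of that set: child-parent links plus links between co-parents.
    nbrs = {v: set() for v in keep}
    for child in keep:
        for p in parents[child]:
            nbrs[child].add(p)
            nbrs[p].add(child)
        for p, q in combinations(parents[child], 2):
            nbrs[p].add(q)
            nbrs[q].add(p)
    # Search for a path from x to y that avoids the conditioning set.
    seen, stack = {x}, [x]
    while stack:
        for w in nbrs[stack.pop()] - given - seen:
            if w == y:
                return False
            seen.add(w)
            stack.append(w)
    return True

# Collider W -> X <- Y, with Z a causal descendant of X.
collider = {"W": set(), "Y": set(), "X": {"W", "Y"}, "Z": {"X"}}
print(d_separated(collider, "W", "Y"))         # True: the collider blocks the path
print(d_separated(collider, "W", "Y", {"X"}))  # False: conditioning on X opens it
print(d_separated(collider, "W", "Y", {"Z"}))  # False: so does a descendant of X
```

The three calls reproduce conditions (i) and (ii) for the collider example W→X←Y: the path is blocked unless the collider, or one of its causal descendants, is in Q.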

The second notion is a “basis set” of d-separation statements. Let S be

the set of d-separation relationships (and therefore independence claims) that

are implied by a DAG. A basis B for S is a set of d-separation claims that

implies, using the laws of probability and the axioms of d-separation, all other

elements in S and no proper subset of B sustains such implications (Pearl 1988;

Verma and Pearl 1988; Pearl 2000). This means that a test of the

independencies in the basis set is necessary and sufficient to test all

independencies implied in the DAG.

The basis set used in Shipley (2000a,b) is the following, where Pi is the set of

causal parents of variable vi in the DAG and (vi, vj) ranges over the pairs of non-adjacent variables: BU={(vi|Pi∪Pj|vj)}. This basis set has the

additional property that its elements imply mutually independent residuals of vi

and of vj, conditional on Pi∪Pj, and therefore mutually independent tests of

independence (Shipley 2000b). Other basis sets are also possible (Pearl 2000)

but are not appropriate for the inferential test discussed here. An exact

inferential test for such a DAG (i.e. a path model without correlated errors) is

obtained by first calculating the probability pj associated with each of the k

independence claims in the basis set BU and then calculating the statistic

$C = -2\sum_{j=1}^{k} \ln(p_j)$. This statistic is distributed as a chi-squared variate with 2k

degrees of freedom if all of the k independence claims are true in the statistical

population. Shipley (2000a) provides more details concerning this test.
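
As a rough illustration of the two steps just described, the Python sketch below lists the claims of BU for a small DAG and then combines a set of null probabilities into C. It is a sketch under stated assumptions, not the author's implementation: the function names, the parent-set representation, the example chain X1→X2→X3→X4 and the three p-values are all hypothetical, and the claims are formed for each pair of non-adjacent variables, which is how I read the construction in Shipley (2000b). The tail probability uses the closed form of the chi-squared survival function, which is available because the degrees of freedom (2k) are always even.

```python
import math
from itertools import combinations

def basis_set(parents):
    """List (vi, vj, Pi | Pj) for every pair of non-adjacent variables."""
    claims = []
    for vi, vj in combinations(sorted(parents), 2):
        if vi in parents[vj] or vj in parents[vi]:
            continue  # adjacent variables imply no independence claim
        claims.append((vi, vj, frozenset(parents[vi] | parents[vj])))
    return claims

def fisher_c(p_values):
    """Return (C, degrees of freedom, null probability of C)."""
    k = len(p_values)
    c = -2.0 * sum(math.log(p) for p in p_values)
    # For even df = 2k: P(chi2 > c) = exp(-c/2) * sum_{i<k} (c/2)^i / i!
    half = c / 2.0
    tail = math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(k))
    return c, 2 * k, tail

# Hypothetical DAG: the chain X1 -> X2 -> X3 -> X4.
chain = {"X1": set(), "X2": {"X1"}, "X3": {"X2"}, "X4": {"X3"}}
for vi, vj, cond in basis_set(chain):
    print(f"({vi} | {sorted(cond)} | {vj})")

# Hypothetical null probabilities, one per claim in the basis set.
c, df, p = fisher_c([0.40, 0.25, 0.70])
print(f"C = {c:.3f}, df = {df}, p = {p:.3f}")
```

The individual p-values would come from whatever test of conditional independence suits the data; the sketch takes them as given.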

Path models with correlated errors

In graphical terms a recursive path model with correlated errors is not a

DAG and therefore the test proposed by Shipley (2000b) cannot be used. This is

because the basis sets for DAGs are not, in general, basis sets for path models

with correlated errors. I had suggested in Shipley (2000b) a way of extending the

test. This extension consisted of constructing an augmented DAG by replacing

each correlated error – represented by a bi-directional arc – by a new latent

variable which is the causal parent of the two variables possessing the correlated

errors, constructing the basis set from this augmented DAG, and then testing

only those d-separation claims in the basis set that do not involve latent

variables. Unfortunately this way of testing a path model with correlated errors is

of limited value because there can be d-separation claims involving only

observed variables that are implied by the model but which are neither in BU nor

which are implied by some combination of those d-separation claims in BU that

do not involve latent variables. Therefore the extension suggested in Shipley

(2000b) is a necessary, but not sufficient, test of the conditional independence

claims of a semi-Markov model. The extension proposed in this paper requires a

few more definitions.

Unshielded colliders

Given a triplet of variables (X,Y,Z) in a graph, if there is an edge between

X and Y, an edge between Y and Z, no edge between X and Z, then call this an

“unshielded pattern”. If, in this unshielded pattern, there are arrowheads pointing

into Y from both X and Z then call Y an “unshielded collider”. This is shown

graphically as X•→Y←•Z where the circle (•) means that there can be an

arrowhead or not in that position. Note that Spirtes et al. (1993; 2000) use an

open circle (o) to represent the same information. If, in this unshielded pattern,

there are not arrowheads pointing into Y from both X and Z then call Y a “definite

non-collider”. This is shown graphically as X•—•Y•—•Z.

An augmented DAG

An augmented DAG (D’) of a path model with correlated errors is obtained

by replacing each correlated error – represented by a bi-directional arc – by a

new latent variable (lij) which is the causal parent of the two variables (vi, vj)

possessing the correlated errors. This augmented DAG will have two types of

variables: observed variables denoted by the set V and latent variables denoted

by the set L. This augmented DAG will have the same d-separation relationships

involving the original (observed) variables as the path model with correlated

errors (Spirtes et al. 1995). See Figure 1a.
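
The construction of the augmented DAG is simple enough to sketch in a few lines of Python. The function name, the parent-set representation and the naming of the latent variables are assumptions made for this illustration. The example model is the chain X1→X2→X3→X4 with a correlated error between X2 and X4, assuming that these are the only edges of the path model.

```python
def augment(parents, correlated_errors):
    """Replace each correlated error (vi, vj) with a latent parent of vi and vj.

    `parents` maps each observed variable to the set of its causal parents.
    Returns the augmented DAG and the set L of latent variables.
    """
    augmented = {v: set(pa) for v, pa in parents.items()}
    latents = set()
    for vi, vj in correlated_errors:
        latent = f"l_{vi}_{vj}"      # hypothetical naming scheme
        augmented[latent] = set()    # the latent is exogenous
        augmented[vi].add(latent)
        augmented[vj].add(latent)
        latents.add(latent)
    return augmented, latents

# Path model: X1 -> X2 -> X3 -> X4 with correlated errors between X2 and X4.
path_model = {"X1": set(), "X2": {"X1"}, "X3": {"X2"}, "X4": {"X3"}}
augmented, latents = augment(path_model, [("X2", "X4")])
for v in sorted(augmented):
    print(v, "<-", sorted(augmented[v]))
```

The input dictionary is copied rather than modified, so the original path model remains available for the later steps.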

Inducing paths

An inducing path between two observed variables (vi, vj) in the augmented

DAG D’ relative to the set of observed variables (V) exists if there is no subset

Q⊆ V\{vi,vj}, including the null subset, such that vi and vj are d-separated given Q.

In a DAG, two variables that are non-adjacent are necessarily d-separated given

some subset of the remaining variables. This is not true in general for semi-

Markov models. An example is seen in Figure 1b. Variables X1 and X4 in Figure

1b are not adjacent. Although they are d-separated given {X3, l24}, they are not

d-separated given any subset of the remaining observed variables {X2, X3}.

Using the null subset there is an open path X1→X2→X3→X4. Using either {X2},

{X3} or {X2, X3} the pair (X1, X4) is not d-separated because there exists the

undirected path X1→X2←l24→X4, which is activated by conditioning on X2

and/or its descendant X3. There is therefore an inducing path between X1 and X4

in the extended DAG D’ over the observed variables. In general, two non-

adjacent observed variables (vi, vj) will have an inducing path between them

relative to V\{vi,vj} if there exists an undirected path U between them such that all

observed variables on U are colliders and are ancestors of either vi or vj.
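
For small models the definition of an inducing path can be applied by brute force: try every subset Q of the remaining observed variables and see whether any of them d-separates the pair. The Python sketch below does this for the augmented DAG discussed above (X1→X2→X3→X4 with the latent l24 pointing into X2 and X4). It is an illustration under assumptions, not a published algorithm: the function names and the parent-set representation are hypothetical, and d-separation is tested with the moralization criterion rather than by enumerating paths.

```python
from itertools import combinations

def d_separated(parents, x, y, given=()):
    """True if x and y are d-separated given `given` in the DAG `parents`."""
    given = set(given)
    # Smallest ancestral set containing x, y and the conditioning set.
    keep, stack = set(), [x, y, *given]
    while stack:
        v = stack.pop()
        if v not in keep:
            keep.add(v)
            stack.extend(parents[v])
    # Moral graph of that set: child-parent links plus links between co-parents.
    nbrs = {v: set() for v in keep}
    for child in keep:
        for p in parents[child]:
            nbrs[child].add(p)
            nbrs[p].add(child)
        for p, q in combinations(parents[child], 2):
            nbrs[p].add(q)
            nbrs[q].add(p)
    # Search for a path from x to y that avoids the conditioning set.
    seen, stack = {x}, [x]
    while stack:
        for w in nbrs[stack.pop()] - given - seen:
            if w == y:
                return False
            seen.add(w)
            stack.append(w)
    return True

def has_inducing_path(parents, observed, vi, vj):
    """True if no subset Q of observed \\ {vi, vj} d-separates vi and vj."""
    others = sorted(set(observed) - {vi, vj})
    for size in range(len(others) + 1):      # size 0 is the null subset
        for q in combinations(others, size):
            if d_separated(parents, vi, vj, q):
                return False
    return True

# Augmented DAG: X1 -> X2 -> X3 -> X4 plus X2 <- l24 -> X4.
fig1b = {"X1": set(), "l24": set(), "X2": {"X1", "l24"},
         "X3": {"X2"}, "X4": {"X3", "l24"}}
observed = ["X1", "X2", "X3", "X4"]
print(d_separated(fig1b, "X1", "X4", {"X3", "l24"}))   # True
print(has_inducing_path(fig1b, observed, "X1", "X4"))  # True
print(has_inducing_path(fig1b, observed, "X1", "X3"))  # False: {X2} separates them
```

The search is exponential in the number of observed variables, which is acceptable for path models of the size considered here but would not scale to large graphs.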

Partial ancestral graphs

Spirtes et al. (1993; 2000) introduced the notion of a PAG, or partial

ancestral graph. Spirtes et al. (1993) previously called the same thing a

“partially oriented inducing path graph” and Desjardins (1999) has called it a

marginal dependency graph. Consider all those DAGs (M) containing the same

set V of observed variables, different sets of latent variables, but that imply the

same d-separation relationships between the set V of observed variables. A

PAG is a graphical construct involving only V such that (i) two variables (vi, vj)

have an edge between them if every DAG in M has an inducing path between vi

and vj relative to V\{vi, vj} and (ii) each unshielded pattern that collides at Y in

every DAG in M also collides in the PAG and (iii) every unshielded pattern that is

a definite non-collider at Y in every DAG in M is also a definite non-collider in the

PAG. There are some other orientation rules that can be applied to the PAG but

these aren’t necessary for the purposes of this paper. In other words, every

faithful acyclic model that has the same conditional independence relationships

has the same PAG. The construction of a PAG is given in the Causal Inference

Algorithm.

Constructing the PAG

The causal inference (CI) algorithm is given on page 183 of Spirtes et al.

(1993). Theorem 6.3 of that reference states that if the input to the CI algorithm,

involving the observed variables, is faithful to the generating graph, then the

output of the CI algorithm is a PAG. Since we assume, under the null hypothesis,

that the path model with correlated errors is correct, this condition always holds by

assumption. The following steps will produce a PAG that is sufficient for our

purposes.

1. Given the path model with correlated errors (G), construct the

extended DAG (G’) by removing each double-headed arrow (vi↔vj)

and replacing it with a latent (lij) that is an exogenous common cause

of only vi and vj (vi←lij→vj).

2. Construct an undirected graph (P) from G by removing all arrowheads

but retaining each edge and adding circles at the ends of each edge

(vi•—•vj).

3. For each pair of non-adjacent variables in G’ (vi, vj) having an inducing

path between them relative to the observed variables in G (i.e.

V\{vi,vj}), add the edge: (vi•—•vj) to P. Call the resulting graph P, at

this step, an undirected dependency graph.

4. For each unshielded pattern in P involving a triplet of variables (X, Y,

Z) orient Y as either a definite non-collider (X•—•Y•—•Z) or an

unshielded collider (X•→Y←•Z) based on the d-separation

relationships in G’.

5. Orient the remaining edges in P in such a way that all definite non-

colliders and unshielded colliders are respected, no new unshielded

colliders are formed, and no cycles are formed. Verma and Pearl

(1992) gave four rules for obtaining a maximally oriented P and

theorems 37 and 38 of Meek (1998) prove that they are sound (i.e. any

orientation other than that specified by these rules would lead to either

a new unshielded collider or to a directed cycle).

The result of step 4 is a PAG. Step 5 simply generates one of the

equivalent acyclic graphs represented by the PAG; Desjardins (1999) calls

this an “inducing path graph”. Because the PAG, and therefore the

resulting inducing path graph (P), are equivalent in their d-separation

relationships to the original path model with correlated errors (G), every d-

separation relationship in the inducing path graph exists in G and there is

no d-separation relationship in G that is not also implied by the inducing

path graph. If P can be oriented in such a way that it is a DAG then one

can test it using the inferential test of Shipley (2000b). I will call inducing

path graphs that are also DAGs “inducing path DAGs”. Figure 2 shows

the steps involved. Note that since the path models in Figure 2a,b are d-

separation equivalent to an inducing path DAG, one can obtain a basis set

and use the inferential test of Shipley (2000b).

On the other hand there is no inducing path DAG that is d-separation

equivalent to the path model in Figure 2c. Although one can still obtain d-

separation relationships implied by this non-DAG inducing path graph, and

therefore by the original path model, these d-separation statements are not

necessarily a basis; the only possibility is to conduct an approximate test

by using a Bonferroni correction to the significance level used in each of

the tests of independence.
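
One plausible reading of this approximate test is sketched below in Python: with k tests of independence and an overall significance level alpha, each test is judged at alpha/k and the model is rejected if any single claim is rejected. The function name and the p-values are hypothetical, and the decision rule is my interpretation of the Bonferroni correction rather than a procedure spelled out in the text.

```python
def bonferroni_test(p_values, alpha=0.05):
    """Return the per-test significance level and a rejection flag per test."""
    k = len(p_values)
    per_test_level = alpha / k
    rejected = [p < per_test_level for p in p_values]
    return per_test_level, rejected

# Hypothetical null probabilities for three d-separation claims.
level, rejected = bonferroni_test([0.30, 0.004, 0.20])
print(f"per-test level = {level:.4f}")
print("model rejected" if any(rejected) else "model not rejected")
```

Because the flags are kept per claim, the local property of the test is preserved: one can still see which independence claim is responsible for a rejection.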

Conclusions

When testing path models without correlated errors (i.e. DAG models), the

inferential test of Shipley (2000b) is superior to classical SEM based on

maximum likelihood for the reasons given in the Introduction. In such

models all constraints on the covariance matrix are independence

constraints and are therefore predicted by d-separation (Pearl 2000).

This extension to the inferential test of Shipley (2000b) that is proposed

here is not always superior to classical SEM when applied to path models

with correlated errors. First, some such path models are not d-separation

equivalent to any inducing path DAG, thus precluding the exact test.

Second, path models with correlated errors can imply constraints on the

covariance matrix that are equalities between functions of covariances

rather than conditional independence constraints. Therefore, when the

assumptions of classical SEM are met, the classical test is more

powerful. Desjardins (1999) contains results that may point to a way of

testing such constraints without resorting to maximum likelihood

estimation. There are conditions in which Shipley’s (2000b) test would still

be preferable to classical SEM even in the case of path models with

correlated errors, assuming that an inducing path DAG exists. First, non-

normal data and non-linear relationships can be accommodated (Shipley

2000a). Second, the test is exact and can therefore be used with small

samples. Third, there are models like the one in Figure 2b that are

unidentified and can therefore not be tested using classical SEM but that

can still be tested using the method presented here. Finally, if the model

is judged not to fit the data, the local property of the test can allow one to

determine which parts of the model are contributing to lack of fit. This

cannot be done using ML estimation since errors in one part of the model

are propagated throughout the rest of the model due to the global nature

of the method.

Acknowledgements

This research was financially supported by the Natural Sciences and Engineering

Research Council of Canada.

References

Desjardins, B. (1999). On the theoretical limits to reliable causal inference.

Faculty of Arts and Sciences. Pittsburgh, University of Pittsburgh, 161.

Geiger, D., Paz, A. & Pearl, J. (1991). Axioms and algorithms for inferences

involving probabilistic independence. Information and Computation 91,

128-141.

Geiger, D. and Pearl, J. (1993). Logical and algorithmic properties of conditional

independence and graphical models. The Annals of Statistics 21, 2001-

2021.

Geiger, D., Verma, T. & Pearl, J. (1990). Identifying independence in Bayesian

Networks. Networks 20, 507-534.

Meek, C. (1998). Graphical models: Selecting causal and statistical models.

Department of Philosophy. Pittsburgh, Carnegie Mellon University.

Pearl, J. (1988). Probabilistic reasoning in intelligent systems: Networks of

plausible inference. Morgan Kaufmann, San Mateo, CA.

Pearl, J. (2000). Causality. Cambridge University Press, Cambridge.

Scheines, R., Spirtes, P., Glymour, C. & Richardson, T. (1998). The TETRAD

project: Constraint based aids to causal model specification. Multivariate

Behavioral Research, 33, 65-117.

Shipley, B. (2000a). Cause and correlation in biology: A user's guide to path

analysis, structural equations, and causal inference. Oxford University

Press, Oxford.

Shipley, B. (2000b). A new inferential test for path models based on directed

acyclic graphs. Structural Equation Modeling 7, 206-218.

Spirtes, P., Glymour, C. & Scheines, R. (1993). Causation, Prediction, and

Search. Springer-Verlag, New York.

Spirtes, P., Glymour, C. & Scheines, R. (2000). Causation, prediction, and

search. MIT Press, Cambridge, Mass.

Spirtes, P., Richardson, T., Meek, A., Scheines, R. & Glymour, C. (1995). Using

D-separation to calculate zero partial correlations in linear models with

correlated errors. Technical Report CMU-Phil-72.

Verma, T. and Pearl, J. (1988). Causal networks: Semantics and

expressiveness. Pp. 69-76 in Schachter, T., Levitt, S. & Kanal, L.N.

(editors). Uncertainty in Artificial Intelligence Volume 4. Amsterdam:

Elsevier.

Verma, T. and Pearl, J. (1992). An algorithm for deciding if a set of observed

independencies has a causal explanation. Pp. 323-330 in Dubois, D.,

Wellman, M.P., D’Ambrosio, B. & Smets, P. (editors). Proceedings of the

8th Conference on Uncertainty in Artificial Intelligence. CA, Morgan

Kaufmann.

Figure 1. (a) A path model with correlated errors (i.e. a semi-Markov model). (b)

The augmented DAG for the path model. (c) A partial ancestral graph for

both the augmented DAG and the path model.

Figure 2. Three different path models with correlated errors (a, b and c) are

shown along with the undirected dependency graph, the partial ancestral

graph (PAG) and the inducing path acyclic graph. The d-separation

relationships shown at the bottom form a basis set for models a and b, but

not for c.

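Figure 1.

[Graph panels (a), (b) and (c), each drawn over the variables X1, X2, X3 and X4; panel (b) also contains the latent variable l24. The edges are not reproduced here.]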

Figure 2.

[Graph panels (a), (b) and (c), each drawn over the variables X1, X2, X3 and X4. From top to bottom each panel shows the path model, the extended DAG (latent variables l12 and l34 in (a) and (b); l12, l23 and l34 in (c)), the undirected dependency graph, the PAG and the inducing path graph (IPG). The edges are not reproduced here.]

D-separation relationships (entries in panel order; ∅ denotes the null conditioning set; the last row has entries for only two of the three panels):

X1_||_X3|X2    X1_||_X3|X2    X1_||_X3|∅

X1_||_X4|∅    X1_||_X4|{X2,X3}    X1_||_X4|∅

X2_||_X4|∅    X2_||_X4|∅
