Identification and Estimation of Causal Effects

Lorenzo Fabbri

July 5, 2024

Contents

1 Resources
2 Time-invariant Exposures
 2.1 Deterministic Treatment Regimes
  2.1.1 Deterministic Static Treatment Regimes
  2.1.2 Deterministic Dynamic Treatment Regimes
  2.1.3 Deterministic Natural Treatment Regimes
  2.1.4 Modified Treatment Policies
 2.2 Random Treatment Regimes
3 Time-varying Exposures
 3.1 Deterministic Treatment Regimes
  3.1.1 Deterministic Static Treatment Regimes
  3.1.2 Deterministic Dynamic Treatment Regimes
  3.1.3 Deterministic Natural Treatment Regimes
  3.1.4 Modified Treatment Policies
 3.2 Random Treatment Regimes
  3.2.1 Random Static Treatment Regimes
  3.2.2 Random Dynamic Treatment Regimes
  3.2.3 Random Natural Treatment Regimes
  3.2.4 Modified Treatment Policies

1 Resources

2 Time-invariant Exposures

2.1 Deterministic Treatment Regimes

A deterministic regime assigns a single treatment value with probability 1.

2.1.1 Deterministic Static Treatment Regimes

The rule for assigning treatment does not depend on past treatment or covariates.

[SWIG with nodes $L$, $A \mid a$, and $Y^{a}$.]

Figure 1: A SWIG representing a static treatment regime.

The joint density is:

\begin{equation}
f(y, l, a) = f(y \mid l, a)\, f(a \mid l)\, f(l). \tag{1}
\end{equation}

After intervening on the exposure A, we have:

\begin{equation}
f^{G}(y, l) = f(y \mid l, a)\, f(l). \tag{2}
\end{equation}

Thus, the expected value of the outcome Y is:

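\begin{align}
\mathbb{E}^{G}[Y] &= \sum_{y} y\, f^{G}(y) \tag{3} \\
&= \sum_{y} y \sum_{l} f^{G}(y, l) \tag{4} \\
&= \sum_{y} \sum_{l} y\, f(y \mid l, a)\, f(l) \tag{5} \\
&= \sum_{l} \mathbb{E}[Y \mid A = a, L = l]\, f(L = l). \tag{6}
\end{align}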

Algorithms. One estimator of $\mathbb{E}^{G}[Y]$ is the parametric g-computation formula, which is based on an outcome model alone. For the simple case of a deterministic static intervention with one exposure and a single time point, the pseudo-algorithm reads as follows:

  1. Fit a regression model with dependent variable $Y$ and independent variables $A$ and $L$.
  2. Predict the outcome $\hat{Y}^{a}$ for every individual using the model fit in the previous step, with the exposure set according to the intervention rule $G$.
  3. Take the average of $\hat{Y}^{a}$ over the observed distribution of the confounders $L$.
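The three steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the simulated data, the linear outcome model, and the helper names `design` and `g_computation` are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Simulated data (hypothetical data-generating process, for illustration only).
L = rng.normal(size=n)                              # confounder
A = rng.binomial(1, 1.0 / (1.0 + np.exp(-L)))       # binary exposure, depends on L
Y = 1.0 + 2.0 * A + 1.5 * L + rng.normal(size=n)    # outcome; true effect of A is 2


def design(a, l):
    """Design matrix of the assumed outcome model: intercept, A, L."""
    return np.column_stack([np.ones_like(l), a, l])


# Step 1: fit the outcome regression of Y on A and L (ordinary least squares).
beta, *_ = np.linalg.lstsq(design(A, L), Y, rcond=None)


def g_computation(a_star):
    # Step 2: predict Y for everyone with the exposure set to a_star.
    y_hat = design(np.full(n, float(a_star)), L) @ beta
    # Step 3: average the predictions over the observed confounders.
    return y_hat.mean()


ey1, ey0 = g_computation(1), g_computation(0)
ate = ey1 - ey0
print(f"E[Y^1] = {ey1:.3f}, E[Y^0] = {ey0:.3f}, difference = {ate:.3f}")
```

Because the assumed outcome model is correctly specified for the simulated data, the two averages should be close to the simulation's true values of 3 and 1; with a misspecified outcome model the estimator would generally be biased.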

In the case of multiple exposures (e.g., if $A$ is a vector of variables), the g-formula remains the same, but the pseudo-algorithm must be modified so that the intervention rule applies to every exposure $a \in A$.

A second estimator of $\mathbb{E}^{G}[Y]$ models the exposure mechanism rather than the outcome, and is referred to as the inverse probability of treatment weighting (IPTW) estimator. It can be derived by noting that the g-formula for $\mathbb{E}^{G}[Y]$ can be rewritten as follows:

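\begin{align}
\mathbb{E}^{G}[Y] &= \sum_{y} \sum_{l} y\, f(y \mid l, a)\, f(l) \tag{7} \\
&= \sum_{y} \sum_{l} y\, \frac{f(y, a \mid l)}{f(a \mid l)}\, f(l) \tag{8} \\
&= \sum_{l} \mathbb{E}\left[ \frac{Y\, I(A = a)}{f(A \mid L)} \,\middle|\, L = l \right] f(l) \tag{9} \\
&= \mathbb{E}\left[ \frac{Y\, I(A = a)}{f(A \mid L)} \right]. \tag{10}
\end{align}

Here $I(A = a)$ indicates that the observed exposure equals the intervention value; for a binary exposure with $a = 1$, the product $Y\, I(A = a)$ reduces to $YA$.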

The pseudo-algorithm for a binary exposure A reads as follows (for simplicity, we do not consider the censoring mechanism C here):

  1. Fit a model with dependent variable A and independent variables L.
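  2. Denoting the predictions from this model as $p_a$, estimate the weights $w_a = \frac{1}{A \times p_a + (1 - A) \times (1 - p_a)}$.
  3. Take the average of $w_a \times I(A = a) \times Y$ over all individuals; for $a = 1$ this is the average of $w_a \times A \times Y$, in line with Equation 10.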
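The IPTW steps can be sketched as follows. This is only an illustration under stated assumptions: the data are simulated, the confounder is binary so that the exposure model can be saturated (the proportion exposed within each level of $L$), and the weighted average is restricted to individuals with $A = a$ through an indicator, in line with Equation 10. The helper name `iptw_mean` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000

# Simulated data (hypothetical data-generating process, for illustration only).
L = rng.binomial(1, 0.4, size=n)                    # binary confounder
A = rng.binomial(1, np.where(L == 1, 0.7, 0.3))     # exposure depends on L
Y = 1.0 + 2.0 * A + 1.5 * L + rng.normal(size=n)    # true E[Y^1] = 3.6, E[Y^0] = 1.6

# Step 1: exposure model. With a single binary confounder, a saturated model
# is simply the proportion exposed within each level of L.
p_a = np.where(L == 1, A[L == 1].mean(), A[L == 0].mean())

# Step 2: weights 1 / f(A | L), using the observed exposure of each individual.
w = 1.0 / (A * p_a + (1 - A) * (1 - p_a))


# Step 3: average of w * I(A = a) * Y over all individuals.
def iptw_mean(a):
    return np.mean(w * (A == a) * Y)


ey1, ey0 = iptw_mean(1), iptw_mean(0)
print(f"E[Y^1] = {ey1:.3f}, E[Y^0] = {ey0:.3f}, difference = {ey1 - ey0:.3f}")
```

With a continuous or high-dimensional $L$, the stratum proportions would typically be replaced by a fitted model such as a logistic regression; that choice is outside the scope of this sketch.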

2.1.2 Deterministic Dynamic Treatment Regimes

The rule for assigning treatment depends on past treatment or covariates.

[SWIG with nodes $L$, $A \mid a^{g}$, and $Y^{g}$.]

Figure 2: A SWIG representing a dynamic treatment regime.

The joint density is:

\begin{equation}
f(y, l, a, a^{g}) = f(y \mid l, a^{g})\, f(a^{g} \mid l)\, f(a \mid l)\, f(l). \tag{11}
\end{equation}

After intervening on the exposure A, we have:

\begin{equation}
f^{G}(y, l, a^{g}) = f(y \mid l, a^{g})\, f(a^{g} \mid l)\, f(l). \tag{12}
\end{equation}

Thus, the expected value of the outcome Y is:

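\begin{align}
\mathbb{E}^{G}[Y] &= \sum_{y} y\, f^{G}(y) \tag{13} \\
&= \sum_{y} y \sum_{l} \sum_{a^{g}} f^{G}(y, l, a^{g}) \tag{14} \\
&= \sum_{y} \sum_{l} \sum_{a^{g}} y\, f(y \mid l, a^{g})\, f(a^{g} \mid l)\, f(l) \tag{15} \\
&= \sum_{l} \sum_{a^{g}} \mathbb{E}[Y \mid A = a^{g}, L = l]\, f(a^{g} \mid L = l)\, f(L = l). \tag{16}
\end{align}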
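For a deterministic dynamic rule, $f(a^{g} \mid l)$ puts all its mass on a single value $g(l)$, so the sum over $a^{g}$ in Equation 16 collapses to evaluating the outcome regression at $A = g(l)$. The sketch below illustrates this with parametric g-computation; the simulated data, the outcome model with an interaction term, and the rule `g` (treat only when the covariate is positive) are all hypothetical choices made for the example.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20_000

# Simulated data (hypothetical data-generating process, for illustration only).
L = rng.normal(size=n)
A = rng.binomial(1, 1.0 / (1.0 + np.exp(-0.5 * L)))
Y = 1.0 + 2.0 * A + 1.5 * L + 1.0 * A * L + rng.normal(size=n)


def design(a, l):
    """Design matrix of the assumed outcome model: intercept, A, L, A*L."""
    return np.column_stack([np.ones_like(l), a, l, a * l])


# Outcome regression of Y on A, L, and their interaction.
beta, *_ = np.linalg.lstsq(design(A, L), Y, rcond=None)


def g(l):
    """Hypothetical dynamic rule: treat only when the covariate is positive."""
    return (l > 0).astype(float)


# Evaluate the fitted regression at A = g(L) and average over the observed L.
ey_g = (design(g(L), L) @ beta).mean()

# Value implied by the simulation: 1 + 2 P(L > 0) + E[L I(L > 0)] for standard normal L.
truth = 2.0 + 1.0 / np.sqrt(2.0 * np.pi)
print(f"estimate = {ey_g:.3f}, simulation truth = {truth:.3f}")
```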

2.1.3 Deterministic Natural Treatment Regimes

The rule for assigning treatment depends on its natural value.

[SWIG with nodes $L$, $A \rightarrow a^{g}$, and $Y^{g}$.]

Figure 3: A SWIG representing a natural treatment regime.

The joint density is:

\begin{equation}
f(y, l, a, a^{g}) = f(y \mid l, a^{g})\, f(a^{g} \mid l, a)\, f(a \mid l)\, f(l). \tag{17}
\end{equation}

After intervening on the exposure A, we have:

\begin{equation}
f^{G}(y, l, a, a^{g}) = f(y \mid l, a^{g})\, f(a^{g} \mid l, a)\, f(a \mid l)\, f(l). \tag{18}
\end{equation}

Thus, the expected value of the outcome Y is:

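\begin{align}
\mathbb{E}^{G}[Y] &= \sum_{y} y\, f^{G}(y) \tag{19} \\
&= \sum_{y} y \sum_{l} \sum_{a} \sum_{a^{g}} f^{G}(y, l, a, a^{g}) \tag{20} \\
&= \sum_{y} \sum_{l} \sum_{a} \sum_{a^{g}} y\, f(y \mid l, a^{g})\, f(a^{g} \mid l, a)\, f(a \mid l)\, f(l) \tag{21} \\
&= \sum_{l} \sum_{a} \sum_{a^{g}} \mathbb{E}[Y \mid A = a^{g}, L = l]\, f(a^{g} \mid a, L = l)\, f(a \mid l)\, f(L = l). \tag{22}
\end{align}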

Algorithms. One estimator of $\mathbb{E}^{G}[Y]$ is the parametric g-computation formula. For the simple case of a deterministic intervention that depends on the natural value of a single exposure at a single time point, it suffices to note that Equation 22 is equivalent to:

\begin{equation}
\mathbb{E}^{G}[Y] = \sum_{l} \sum_{a} \sum_{a^{g}} \mathbb{E}[Y \mid A = a^{g}, L = l]\, f(a^{g}, a, l). \tag{23}
\end{equation}

The pseudo-algorithm then reads as follows:

  1. Fit a regression model with dependent variable $Y$ and independent variables $A$ and $L$.
  2. Predict the outcome $\hat{Y}^{a}$ for every individual using the model fit in the previous step, with the exposure set according to the intervention rule $G$.
  3. Take the average of $\hat{Y}^{a}$ over the observed distribution of the confounders $L$.
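A minimal sketch of these steps for a rule that depends on the natural value of the exposure is given below. The simulated data, the linear outcome model, and the rule `g` (keep the natural value unless it exceeds a cap of 1.0) are hypothetical choices for illustration. Averaging the predictions over the observed triples $(a^{g}, a, l)$ corresponds to replacing $f(a^{g}, a, l)$ in Equation 23 with its empirical counterpart.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 20_000

# Simulated data (hypothetical data-generating process, for illustration only).
L = rng.normal(size=n)
A = 0.5 * L + rng.normal(size=n)                    # continuous exposure
Y = 1.0 + 2.0 * A + 1.5 * L + rng.normal(size=n)


def design(a, l):
    """Design matrix of the assumed outcome model: intercept, A, L."""
    return np.column_stack([np.ones_like(l), a, l])


# Step 1: outcome regression of Y on A and L.
beta, *_ = np.linalg.lstsq(design(A, L), Y, rcond=None)


def g(a, l, cap=1.0):
    """Hypothetical rule: keep the natural value a unless it exceeds the cap."""
    return np.minimum(a, cap)


# Step 2: predict with each individual's exposure replaced by g(A, L).
A_g = g(A, L)
y_hat = design(A_g, L) @ beta

# Step 3: average the predictions over the observed data.
ey_g = y_hat.mean()
ey_obs = Y.mean()
print(f"mean outcome under the rule = {ey_g:.3f}, observed mean = {ey_obs:.3f}")
```

Since the rule only lowers exposures and the simulated effect of the exposure is positive, the estimated mean under the rule should fall below the observed mean.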

In the case of multiple exposures (e.g., if $A$ is a vector of variables), the g-formula remains the same, but the pseudo-algorithm must be modified so that the intervention rule applies to every exposure $a \in A$.

2.1.4 Modified Treatment Policies

2.2 Random Treatment Regimes

A random regime assigns each treatment value with a probability between 0 and 1, rather than with probability 1.

3 Time-varying Exposures

3.1 Deterministic Treatment Regimes

A deterministic regime assigns a single treatment value with probability 1.

3.1.1 Deterministic Static Treatment Regimes

The rule for assigning treatment does not depend on past treatment or covariates.

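Formally, the regime is deterministic and static if $f^{\text{int}}(a_k \mid \bar{a}_{k-1}, D_k = 0)$ is either 0 or 1 for each $\bar{a}_k$ and for $k = 0, \dots, K$. In particular, given the regime $g = (g_0, \dots, g_K)$, $f^{\text{int}}(a_k \mid \bar{a}^{g}_{k-1}, D_k = 0) = 1$ if $a_k = a^{g}_k$, and 0 otherwise, with $a^{g}_s = g_s(\bar{a}^{g}_{s-1})$.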

[SWIG with nodes $L_0$, $A_0 \mid a_0$, $L_1^{a_0}$, $A_1^{a_0} \mid a_1$, and $Y^{a_0, a_1}$.]

Figure 4: A SWIG representing a static treatment regime.

The joint density is:

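\begin{align}
f(y, l_0, l_1, a_0, a_1) ={}& f(y \mid l_0, l_1, a_0, a_1) \times \tag{24} \\
& f(a_1 \mid l_0, l_1, a_0) \times f(l_1 \mid l_0, a_0) \times \tag{25} \\
& f(a_0 \mid l_0) \times f(l_0). \tag{26}
\end{align}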

After intervening on the exposure A at both time points, we have:

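\begin{align}
f^{G}(y, l_0, l_1, a_0, a_1) ={}& f(y \mid l_0, l_1, a_0, a_1) \times \tag{27} \\
& f(a_1 \mid l_0, l_1, a_0) \times f(l_1 \mid l_0, a_0) \times \tag{28} \\
& f(a_0 \mid l_0) \times f(l_0). \tag{29}
\end{align}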

Thus, the expected value of the outcome Y is:

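\begin{align}
\mathbb{E}^{G}[Y] &= \sum_{y} y\, f^{G}(y) \tag{30} \\
&= \sum_{y} y \sum_{l_0} \sum_{l_1} \sum_{a_0} \sum_{a_1} f^{G}(y, l_0, l_1, a_0, a_1) \tag{31} \\
&= \sum_{l_0} \sum_{l_1} \sum_{a_0} \sum_{a_1} \mathbb{E}[Y \mid L_0 = l_0, L_1 = l_1, A_0 = a_0, A_1 = a_1] \times f(a_1 \mid l_0, l_1, a_0) \times f(l_1 \mid l_0, a_0) \times f(a_0 \mid l_0) \times f(l_0). \tag{32}
\end{align}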

Algorithms. One estimator of $\mathbb{E}^{G}[Y]$ is the parametric g-computation formula. For the case of a deterministic static intervention with one exposure and two time points, we can rewrite Equation 32 so that it corresponds to a series of conditional expectations:

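\begin{align}
\mathbb{E}^{G}[Y] &= \sum_{l_0} \sum_{l_1} \sum_{a_0} \sum_{a_1} \mathbb{E}[Y \mid L_0 = l_0, L_1 = l_1, A_0 = a_0, A_1 = a_1] \times f(a_1 \mid l_0, l_1, a_0) \times f(l_1 \mid l_0, a_0) \times f(a_0 \mid l_0) \times f(l_0) \tag{33} \\
&= \sum_{l_0} \sum_{l_1} \sum_{a_0} \sum_{a_1} \mathbb{E}[Y \mid L_0 = l_0, L_1 = l_1, A_0 = a_0, A_1 = a_1] \times f(a_1, l_1 \mid l_0, a_0) \times f(a_0, l_0). \tag{34}
\end{align}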

Equation 34 suggests a different form of the parametric g-computation formula, which in the literature is usually called the iterated conditional expectation (ICE) g-computation formula. The pseudo-algorithm then reads as follows ($\bar{A}_t$ denotes the history of $A$ up to time $t$, and similarly for $\bar{L}_t$):

  1. Fit a regression model with dependent variable $Y$ and independent variables $\bar{A}_1$ and $\bar{L}_1$ (that is, $A_0$, $A_1$, $L_0$, and $L_1$, matching the conditional expectation in Equation 32).
  2. Predict the outcome $\hat{Y}^{\bar{a}_1}$ using the model fit in the previous step, with the exposure $A_1$ set according to the intervention rule $G$.
  3. Fit a regression model with dependent variable $\hat{Y}^{\bar{a}_1}$ and independent variables $A_0$ and $L_0$.
  4. Predict the outcome $\hat{Y}^{\bar{a}_0}$ using the model fit in the previous step, with the exposure $A_0$ set according to the intervention rule $G$.
  5. Take the average of $\hat{Y}^{\bar{a}_0}$ over the observed distribution of the confounders $L_0$.
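The five steps can be sketched for two time points as follows. This is an illustration under stated assumptions, not a reference implementation: the data are simulated, both regressions are linear and fit by ordinary least squares, the first regression includes $L_0$ as well as $L_1$, and the helper names `ols` and `ice_g_computation` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50_000


def expit(x):
    return 1.0 / (1.0 + np.exp(-x))


# Simulated data (hypothetical data-generating process, for illustration only).
L0 = rng.normal(size=n)
A0 = rng.binomial(1, expit(0.5 * L0))
L1 = 0.5 * L0 + 1.0 * A0 + rng.normal(size=n)       # confounder affected by A0
A1 = rng.binomial(1, expit(0.5 * L1 + 0.5 * A0 - 0.5))
Y = 1.0 + 1.0 * A0 + 2.0 * A1 + 0.5 * L0 + 1.5 * L1 + rng.normal(size=n)


def ols(X, y):
    """Least-squares coefficients of y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def ice_g_computation(a0, a1):
    ones = np.ones(n)
    # Step 1: regress Y on the exposure and confounder histories.
    b1 = ols(np.column_stack([ones, A0, A1, L0, L1]), Y)
    # Step 2: predict with A1 set to a1 (A0 left at its observed value).
    q1 = np.column_stack([ones, A0, np.full(n, float(a1)), L0, L1]) @ b1
    # Step 3: regress the predictions on A0 and L0.
    b0 = ols(np.column_stack([ones, A0, L0]), q1)
    # Step 4: predict with A0 set to a0.
    q0 = np.column_stack([ones, np.full(n, float(a0)), L0]) @ b0
    # Step 5: average over the observed baseline confounders.
    return q0.mean()


ey_11 = ice_g_computation(1, 1)   # "always treat"; simulation truth is 5.5
ey_00 = ice_g_computation(0, 0)   # "never treat"; simulation truth is 1.0
print(f"E[Y^(1,1)] = {ey_11:.3f}, E[Y^(0,0)] = {ey_00:.3f}")
```

Note that no model for $L_1$ is fit at any point; the dependence of $L_1$ on $A_0$ is absorbed by the second regression.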

In the case of more than two time points, simply repeat the steps above until reaching t = 0. The ICE g-computation formula is appealing because it does not require the specification of models for the confounders at each time point.

In the case of multiple exposures (e.g., if $A_t$ is a vector of variables), the g-formula remains the same, but the pseudo-algorithm must be modified so that the intervention rule applies to every exposure $a_t \in A_t$.

3.1.2 Deterministic Dynamic Treatment Regimes

The rule for assigning treatment depends on past treatment or covariates.

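Formally, the regime is deterministic and dynamic if $f^{\text{int}}(a_k \mid \bar{l}_k, \bar{a}_{k-1}, D_k = 0)$ is either 0 or 1 for each $(\bar{a}_k, \bar{l}_k)$ and for $k = 0, \dots, K$. In particular, given the regime $g = (g_0, \dots, g_K)$, $f^{\text{int}}(a_k \mid \bar{l}_k, \bar{a}^{g}_{k-1}, D_k = 0) = 1$ if $a_k = a^{g}_k$, and 0 otherwise, with $a^{g}_s = g_s(\bar{l}_s, \bar{a}^{g}_{s-1})$.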

3.1.3 Deterministic Natural Treatment Regimes

The rule for assigning treatment depends on its natural value.

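[SWIG with covariates $L_0$ and $L_1$, natural treatment values $A_0$ and $A_1$, each with an arrow into its intervened counterpart, and outcome $Y$.]

Figure 5: A SWIG representing a natural treatment regime.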

The joint density is:

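\begin{align}
f(y, l_0, l_1, a_0, a_0^{g}, a_1, a_1^{g}) ={}& f(y \mid l_0, l_1, a_0^{g}, a_1^{g}) \times \tag{35} \\
& f(a_1^{g} \mid l_0, l_1, a_0^{g}, a_1) \times \tag{36} \\
& f(a_1 \mid l_0, l_1, a_0^{g}) \times \tag{37} \\
& f(l_1 \mid l_0, a_0^{g}) \times \tag{38} \\
& f(a_0^{g} \mid l_0, a_0) \times \tag{39} \\
& f(a_0 \mid l_0)\, f(l_0). \tag{40}
\end{align}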

After intervening on the exposure A, we have:

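\begin{align}
f^{G}(y, l_0, l_1, a_0, a_0^{g}, a_1, a_1^{g}) ={}& f(y \mid l_0, l_1, a_0^{g}, a_1^{g}) \times \tag{41} \\
& f(a_1^{g} \mid l_0, l_1, a_0^{g}, a_1) \times \tag{42} \\
& f(a_1 \mid l_0, l_1, a_0^{g}) \times \tag{43} \\
& f(l_1 \mid l_0, a_0^{g}) \times \tag{44} \\
& f(a_0^{g} \mid l_0, a_0) \times \tag{45} \\
& f(a_0 \mid l_0)\, f(l_0). \tag{46}
\end{align}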

Thus, the expected value of the outcome Y is:

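\begin{align}
\mathbb{E}^{G}[Y] &= \sum_{y} y\, f^{G}(y) \tag{47} \\
&= \sum_{y} y \sum_{l_0} \sum_{l_1} \sum_{a_0} \sum_{a_0^{g}} \sum_{a_1} \sum_{a_1^{g}} f^{G}(y, l_0, l_1, a_0, a_0^{g}, a_1, a_1^{g}) \tag{48} \\
&= \sum_{l_0} \sum_{l_1} \sum_{a_0} \sum_{a_0^{g}} \sum_{a_1} \sum_{a_1^{g}} \mathbb{E}[Y \mid L_0 = l_0, L_1 = l_1, A_0 = a_0^{g}, A_1 = a_1^{g}] \times f(a_1^{g} \mid l_0, l_1, a_0^{g}, a_1) \times f(a_1 \mid l_0, l_1, a_0^{g}) \times f(l_1 \mid l_0, a_0^{g}) \times f(a_0^{g} \mid l_0, a_0) \times f(a_0 \mid l_0)\, f(l_0). \tag{49}
\end{align}

3.1.4 Modified Treatment Policies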

3.2 Random Treatment Regimes

A random regime assigns each treatment value with a probability between 0 and 1, rather than with probability 1.

3.2.1 Random Static Treatment Regimes

The rule for assigning treatment does not depend on past covariates.

3.2.2 Random Dynamic Treatment Regimes

The rule for assigning treatment depends on past covariates.

3.2.3 Random Natural Treatment Regimes

The rule for assigning treatment depends on its natural value.

3.2.4 Modified Treatment Policies