Lecture 8
RN 13.2 · PM 8.3
Today: generalize the 3 structures to arbitrary paths using d-separation.
If \(E\) d-separates \(X\) and \(Y\), then \(X\) and \(Y\) are conditionally independent given \(E\).
A single open path between \(X\) and \(Y\) is enough to break the independence guarantee, so we have to check every path and ask: is it blocked?
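A path is blocked if at least one triple along it, with middle node \(B\) and neighbours \(U\) and \(V\), is blocked:
Chain (\(U \to B \to V\)): blocked when \(B \in E\) (middle is observed).
Common cause (\(U \leftarrow B \to V\)): blocked when \(B \in E\) (middle is observed).
Common effect (\(U \to B \leftarrow V\)): reversed, blocked when neither \(B\) nor any of its descendants is in \(E\).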
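The three blocking rules can be turned into a small path-checking routine. The following is a minimal sketch, not an optimized algorithm: it enumerates simple paths, so it only suits small graphs. The names `d_separated`, `descendants`, and `triple_blocked` are illustrative, and the code assumes that neither endpoint is itself in the evidence set.

```python
def descendants(children, node):
    """All nodes reachable from `node` by following edges forward."""
    seen, stack = set(), [node]
    while stack:
        for child in children.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def d_separated(edges, x, y, evidence):
    """True if every undirected path from x to y is blocked given `evidence`.

    `edges` is an iterable of (parent, child) pairs describing the DAG.
    """
    edges, evidence = set(edges), set(evidence)
    children, neighbours = {}, {}
    for a, b in edges:
        children.setdefault(a, set()).add(b)
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    def triple_blocked(u, b, v):
        if (u, b) in edges and (v, b) in edges:
            # Common effect: blocked unless b or one of its descendants is observed.
            return b not in evidence and not (descendants(children, b) & evidence)
        # Chain or common cause: blocked exactly when the middle node is observed.
        return b in evidence

    def open_path_exists(path):
        last = path[-1]
        if last == y:
            return True
        for nxt in neighbours.get(last, ()):
            if nxt in path:
                continue  # simple paths only
            if len(path) >= 2 and triple_blocked(path[-2], last, nxt):
                continue  # this triple blocks the path
            if open_path_exists(path + [nxt]):
                return True
        return False

    return not open_path_exists([x])


chain = [("B", "A"), ("A", "W")]     # B -> A -> W
collider = [("E", "A"), ("B", "A")]  # E -> A <- B
print(d_separated(chain, "B", "W", set()))     # False: the chain is open
print(d_separated(chain, "B", "W", {"A"}))     # True: observed middle blocks it
print(d_separated(collider, "E", "B", set()))  # True: unobserved common effect blocks
print(d_separated(collider, "E", "B", {"A"}))  # False: observing A opens the path
```

The recursion checks each consecutive triple as the path grows, so a path is abandoned as soon as one triple on it is blocked.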
Q1. Are TravelSubway and HighTemp independent?
Q2. Are TravelSubway and HighTemp independent given Flu?
Q3. Are Aches and HighTemp independent?
Q4. Are Aches and HighTemp independent given Flu?
Q5. Are Flu and ExoticTrip independent?
Q6. Are Flu and ExoticTrip independent given HighTemp?
A BN is correct if every independence it claims also holds in the true distribution.
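Many different BNs can be correct for the same distribution (a fully connected one claims no independences at all), so prefer the BN with the fewest edges (fewest probabilities to store).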
Ordering matters: a bad order can force every later variable to depend on many earlier ones.
Ex 1. Original BN: \(B \to A \to W\). Order: \(W, A, B\).
Set { }: add \(W\) first.
Set { W }: is \(A\) dependent on \(W\)? Yes (chain in original). Add edge \(W \to A\).
Set { W, A }: are \(B\) and \(W\) independent given \(A\)? Yes (chain blocked). So \(A\) is \(B\)'s only parent.
Final: \(W \to A \to B\). Same shape as the original (2 edges).
Ex 1 alt. Original BN: \(B \to A \to W\). Order: \(A, W, B\).
Set { }: add \(A\) first.
Set { A }: \(W\) depends on \(A\). Add \(A \to W\).
Set { A, W }: are \(B\) and \(W\) independent given \(A\)? Yes. Are \(B\) and \(A\) independent given \(W\)? No. So only \(A\) is \(B\)'s parent.
Final: \(A \to W\) and \(A \to B\). Different shape, still 2 edges — different but equally compact.
Ex 2. Original BN: \(A \to W\), \(A \to G\). Order: \(W, G, A\).
Set { }: add \(W\).
Set { W }: \(G\) and \(W\) are dependent (their shared cause \(A\) is unobserved). Add \(W \to G\).
Set { W, G }: in the original, \(A\) is the parent of both, so \(A\) depends on each of them even given the other. Both \(W\) and \(G\) must be parents of \(A\).
Final: 3 edges (\(W \to G, W \to A, G \to A\)) — more than the original's 2 edges. Suboptimal!
Ex 3. Original BN: \(E \to A\), \(B \to A\). Order: \(A, B, E\).
Set { }: add \(A\).
Set { A }: \(B\) depends on \(A\) (direct neighbours). Add \(A \to B\).
Set { A, B }: \(E\) and \(B\) are NOT independent given \(A\) (v-structure middle observed!). \(E\) and \(A\) are always dependent. Both are \(E\)'s parents.
Final: 3 edges (\(A \to B, A \to E, B \to E\)) — more than the original's 2. Reversed v-structures are expensive.
Adding effects before causes can force each later node to depend on most or all of the earlier ones.
\(1 + 2 + 4 + 8 + 16 + 2 = \mathbf{33}\) probabilities — vs 12 with the causal order.
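The sum above can be checked with a short calculation. This sketch assumes Boolean variables, so a node with \(k\) parents stores \(2^k\) probabilities, and it reads the parent counts 0, 1, 2, 3, 4, 1 off the terms of the sum. The function name `num_probabilities` is illustrative.

```python
def num_probabilities(parent_counts):
    """Probabilities stored by a BN over Boolean variables.

    Each node needs one number per assignment of its parents: 2 ** k for k parents.
    """
    return sum(2 ** k for k in parent_counts)


# Parent counts implied by the terms 1 + 2 + 4 + 8 + 16 + 2 (an assumption).
bad_order_parents = [0, 1, 2, 3, 4, 1]
print(num_probabilities(bad_order_parents))  # 33

# For comparison: a fully connected network over six Boolean variables.
print(num_probabilities([0, 1, 2, 3, 4, 5]))  # 63
```

The cost grows exponentially in the number of parents, which is why a few extra edges on late nodes dominate the total.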
Causes precede effects. Add root causes first; effects last.
| Example | Original | Order | Reconstructed | Result |
|---|---|---|---|---|
| Ex 1 | \(B \to A \to W\) | \(W, A, B\) | \(W \to A \to B\) | 2 edges, same shape |
| Ex 1 alt | \(B \to A \to W\) | \(A, W, B\) | \(A \to W, A \to B\) | 2 edges, different shape |
| Ex 2 | \(A \to W, A \to G\) | \(W, G, A\) | \(W \to G, W \to A, G \to A\) | 3 edges — worse |
| Ex 3 | \(E \to A, B \to A\) | \(A, B, E\) | \(A \to B, A \to E, B \to E\) | 3 edges — worse |
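
The rows of the table can be reproduced mechanically. The sketch below adds variables in a given order and, for each one, picks the smallest set of earlier variables that d-separates it from the remaining earlier ones in the original BN. It assumes that d-separation in the original graph is a faithful stand-in for independence in the true distribution. The helper names `d_separated` and `rebuild` are illustrative, and the brute-force search over subsets only suits small examples.

```python
from itertools import combinations


def d_separated(edges, x, y, evidence):
    """True if every simple path from x to y is blocked given `evidence`."""
    edges, evidence = set(edges), set(evidence)
    nbrs, kids = {}, {}
    for a, b in edges:
        kids.setdefault(a, set()).add(b)
        nbrs.setdefault(a, set()).add(b)
        nbrs.setdefault(b, set()).add(a)

    def below(node):  # descendants of node
        out, stack = set(), [node]
        while stack:
            for c in kids.get(stack.pop(), ()):
                if c not in out:
                    out.add(c)
                    stack.append(c)
        return out

    def blocked(u, b, v):
        if (u, b) in edges and (v, b) in edges:  # common effect
            return not (({b} | below(b)) & evidence)
        return b in evidence                     # chain or common cause

    def is_open(path):
        if path[-1] == y:
            return True
        return any(
            is_open(path + [n])
            for n in nbrs.get(path[-1], ())
            if n not in path
            and not (len(path) > 1 and blocked(path[-2], path[-1], n))
        )

    return not is_open([x])


def rebuild(edges, order):
    """Reconstruct a BN by adding variables in `order`, smallest parent set first."""
    new_edges = []
    for i, x in enumerate(order):
        earlier = order[:i]
        for size in range(len(earlier) + 1):
            parents = next(
                (ps for ps in combinations(earlier, size)
                 if all(d_separated(edges, x, v, ps)
                        for v in earlier if v not in ps)),
                None,
            )
            if parents is not None:  # always found by size == len(earlier)
                break
        new_edges += [(p, x) for p in parents]
    return new_edges


print(rebuild([("B", "A"), ("A", "W")], ["W", "A", "B"]))  # Ex 1: 2 edges
print(rebuild([("A", "W"), ("A", "G")], ["W", "G", "A"]))  # Ex 2: 3 edges
print(rebuild([("E", "A"), ("B", "A")], ["A", "B", "E"]))  # Ex 3: 3 edges
print(rebuild([("E", "A"), ("B", "A")], ["E", "B", "A"]))  # causal order: 2 edges
```

Running Ex 3 with the causal order \(E, B, A\) recovers the original two edges, which matches the rule that causes should precede effects.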
Finding the most compact BN is NP-hard in general — but a causal ordering is a good heuristic.
Two variables can be highly correlated without one causing the other.
Intervention severs the incoming edges of \(S\) — the confounder can no longer reach \(S\).
The average treatment effect averages the effect over the confounder \(A\):
\(\mathrm{ATE} = \sum_A P(R \mid S{=}1, A)\, P(A) - \sum_A P(R \mid S{=}0, A)\, P(A)\)
\(\mathrm{ATE} \approx 0\)
Shoe size doesn't cause reading skill — randomised experiments confirm it.
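A small numeric sketch shows how the adjustment formula separates correlation from causation. All numbers are made up for illustration. The code assumes \(S\) is large shoe size, \(R\) is good reading skill, and \(A\) is a binary confounder (for example an age group), with \(R\) depending only on \(A\). The function names `adjusted` and `observed` are illustrative.

```python
# Hypothetical numbers, chosen only for illustration.
p_a = {0: 0.5, 1: 0.5}           # P(A = a)
p_s1_given_a = {0: 0.1, 1: 0.9}  # P(S = 1 | A = a)
# P(R = 1 | S = s, A = a), keyed by (s, a): depends on a only, not on s.
p_r1_given_sa = {(0, 0): 0.2, (1, 0): 0.2, (0, 1): 0.8, (1, 1): 0.8}


def adjusted(s):
    """sum_A P(R=1 | S=s, A) P(A): reading skill under the intervention S = s."""
    return sum(p_r1_given_sa[(s, a)] * p_a[a] for a in p_a)


def observed(s):
    """P(R=1 | S=s): plain conditioning, which lets A leak through S."""
    joint = {
        a: (p_s1_given_a[a] if s == 1 else 1 - p_s1_given_a[a]) * p_a[a]
        for a in p_a
    }
    total = sum(joint.values())  # P(S = s)
    return sum(p_r1_given_sa[(s, a)] * joint[a] / total for a in p_a)


ate = adjusted(1) - adjusted(0)
naive = observed(1) - observed(0)
print(f"ATE (adjusted for A): {ate:.2f}")    # 0.00
print(f"Observed difference:  {naive:.2f}")  # 0.48
```

With these numbers the observed difference is large even though the adjusted effect is zero: conditioning on \(S\) changes the distribution of \(A\), while the adjustment keeps \(P(A)\) fixed.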
We can now reason under uncertainty. Next, we combine these probabilities with utilities to act — choose the action that maximizes expected reward.