
The Book of Why

Metadata

Highlights

we collect data only after we posit the causal model, after we state the scientific query we wish to answer, and after we derive the estimand. — location: 309 ^ref-5062


information about the effects of actions or interventions is simply not available in raw data, unless it is collected by controlled experimental manipulation. — location: 315 ^ref-41423


any query about the mechanism by which causes transmit their effects—the most prototypical “Why?” question—is actually a counterfactual question in disguise. — location: 321 ^ref-42157


That’s all that a deep-learning program can do: fit a function to data. — location: 332 ^ref-45068


human intuition is grounded in causal, not statistical, logic. — location: 364 ^ref-42273


These patterns are called back-door adjustment, front-door adjustment, and instrumental variables, the workhorses of causal inference in practice. — location: 369 ^ref-24562


“We think of a cause as something that makes a difference, and the difference it makes must be a difference from what would have happened without it.” — location: 377 ^ref-9655


you are smarter than your data. Data do not understand causes and effects; humans do. — location: 392 ^ref-46601


this story bore the cultural footprints of the actual process by which Homo sapiens gained dominion over our planet. — location: 407 ^ref-43453


It is useless to ask for the causes of things unless you can imagine their consequences. — location: 435 ^ref-11174


you cannot claim that Eve caused you to eat from the tree unless you can imagine a world in which, counter to facts, she did not hand you the apple. — location: 436 ^ref-33085


The first rung of the ladder calls for predictions based on passive observations. It is characterized by the question “What if I see …?” — location: 483 ^ref-61934


The goal of strong AI is to produce machines with humanlike intelligence, able to converse with and guide humans. Deep learning has instead given us machines with truly impressive abilities but no intelligence. The difference is profound and lies in the absence of a model of reality. — location: 506 ^ref-60588


the defining query of the second rung of the Ladder of Causation is “What if we do…?” What will happen if we change the environment? — location: 537 ^ref-61403


Finding out why a blunder occurred allows us to take the right corrective measures in the future. Finding out why a treatment worked on some people and not on others can lead to a new cure for a disease. — location: 565 ^ref-51503


The advantage we gained from imagining counterfactuals was the same then as it is today: flexibility, the ability to reflect and improve on past actions, and, perhaps even more significant, our willingness to take responsibility for past and current actions. — location: 587 ^ref-23792


One major contribution of AI to the study of cognition has been the paradigm “Representation first, acquisition second.” — location: 624 ^ref-35417


making an event happen means that you emancipate it from all other influences and subject it to one and only one influence—that which enforces its happening. — location: 668 ^ref-63212


Very often the structure of the diagram itself enables us to estimate all sorts of causal and counterfactual relationships: simple or complicated, deterministic or probabilistic, linear or nonlinear. — location: 731 ^ref-34586


human intuition is organized around causal, not statistical, relations. — location: 752 ^ref-29189


Probabilities, as given by expressions like P(Y | X), lie on the first rung of the Ladder of Causation and cannot ever (by themselves) answer queries on the second or third rung. Any attempt to “define” causation in terms of seemingly simpler, first-rung concepts must fail. — location: 768 ^ref-63114
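A minimal numeric sketch of this gap, with made-up probabilities (not from the book): in a model where a confounder Z drives both X and Y, and X has no causal effect on Y at all, the observational first-rung quantity P(Y | X) still shows a strong association, while the second-rung P(Y | do(X)) — here computed by adjusting for Z — does not.

```python
# Toy model (made-up numbers): Z is a confounder, Z -> X and Z -> Y,
# and X has NO causal effect on Y.
p_z1 = 0.5                       # P(Z=1)
p_x1_given_z = {1: 0.9, 0: 0.1}  # P(X=1 | Z)
p_y1_given_z = {1: 0.8, 0: 0.2}  # P(Y=1 | Z); Y is independent of X given Z

# First rung: observational P(Y=1 | X=1), obtained by summing over Z.
p_x1 = sum(p_x1_given_z[z] * (p_z1 if z else 1 - p_z1) for z in (0, 1))
p_y1_given_x1 = sum(
    p_y1_given_z[z] * p_x1_given_z[z] * (p_z1 if z else 1 - p_z1)
    for z in (0, 1)
) / p_x1

# Second rung: P(Y=1 | do(X=1)) via adjustment over Z,
# sum_z P(Y=1 | X=1, Z=z) * P(Z=z). Since Y is independent of X
# given Z here, this reduces to the base rate of Y.
p_y1_do_x1 = sum(p_y1_given_z[z] * (p_z1 if z else 1 - p_z1) for z in (0, 1))

print(round(p_y1_given_x1, 2))  # 0.74 -- seeing X=1 "predicts" Y via Z
print(round(p_y1_do_x1, 2))     # 0.5  -- but intervening on X changes nothing
```

Seeing X = 1 raises the probability of Y to 0.74, yet setting X = 1 leaves it at its base rate of 0.5: no first-rung expression alone could distinguish these two numbers.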


definitions demand reduction, and reduction demands going to a lower rung. — location: 771 ^ref-31136


confounding too is a causal concept and hence defies probabilistic formulation. — location: 787 ^ref-55254


the notion of probability raising cannot be expressed in terms of probabilities. — location: 796 ^ref-57924


while probabilities encode our beliefs about a static world, causality tells us whether and how probabilities change when the world changes, be it by intervention or by act of imagination. — location: 829 ^ref-39370


the scatter plot has a roughly elliptical shape—a fact that was crucial to Galton’s analysis and characteristic of bell-shaped distributions with two variables. — location: 944 ^ref-54656


the slope of the best-fit line always enjoys the same properties. It equals 1 only when one quantity can predict the other precisely; it is 0 whenever the prediction is no better than a random guess. The slope (after scaling) is the same no matter whether you plot X against Y or Y against X. In other words, the slope is completely agnostic as to cause and effect. One variable could cause the other, or they could both be effects of a third cause; for the purpose of prediction, it does not matter. — location: 972 ^ref-48349
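This symmetry is easy to check numerically. A sketch with toy data (my own, for illustration): the standardized slope cov(X, Y) / (sd_X · sd_Y) is Pearson's correlation coefficient — it equals 1 for a perfectly predictable linear relationship and is identical whichever variable is plotted against which.

```python
def standardized_slope(a, b):
    """Slope of the best-fit line after scaling both variables to unit
    variance -- i.e., Pearson's correlation coefficient."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    var_a = sum((ai - ma) ** 2 for ai in a)
    var_b = sum((bi - mb) ** 2 for bi in b)
    return cov / (var_a * var_b) ** 0.5

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_exact = [2 * xi + 1 for xi in x]    # perfectly predictable from x
y_noisy = [3.1, 4.8, 7.3, 8.9, 11.2]  # made-up noisy version

print(standardized_slope(x, y_exact))  # 1.0: one quantity predicts the other
print(standardized_slope(x, y_noisy))  # below 1.0: imperfect prediction
print(standardized_slope(x, y_noisy) ==
      standardized_slope(y_noisy, x))  # True: agnostic to cause and effect
```

Swapping the arguments changes nothing in the formula, which is exactly why the slope carries no information about causal direction.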


“Correlation does not imply causation” should give way to “Some correlations do imply causation.” — location: 1169 ^ref-63359


The realization that you cannot even tell A → B → C apart from A ← B ← C from data alone was a painful frustration. — location: 2022 ^ref-17304


Collider-induced correlations — location: 2990 ^ref-46743


that Reichenbach’s dictum was too strong, because it neglects to account for the process by which observations are selected. — location: 3000 ^ref-61700
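A sketch of selection inducing a collider correlation, with made-up numbers: take two independent fair binary variables A and B, and observe a case only when at least one of them is 1 (selection S = A or B, a collider). Before selection, learning B tells us nothing about A; within the selected sample, learning B = 1 lowers the probability that A = 1.

```python
from itertools import product

# Two independent fair binary variables; all four worlds equally likely.
samples = [(a, b) for a, b in product((0, 1), repeat=2)]

def p(event, world):
    """Probability of `event` over equally weighted `world` outcomes."""
    return sum(1 for w in world if event(w)) / len(world)

# Without selection: A and B are independent.
p_a1 = p(lambda w: w[0] == 1, samples)                    # 0.5
p_a1_given_b1 = p(lambda w: w[0] == 1,
                  [w for w in samples if w[1] == 1])      # 0.5

# Condition on the collider: keep only worlds where S = (A or B) = 1.
selected = [w for w in samples if w[0] or w[1]]
p_a1_sel = p(lambda w: w[0] == 1, selected)               # 2/3
p_a1_given_b1_sel = p(lambda w: w[0] == 1,
                      [w for w in selected if w[1] == 1])  # 1/2

print(p_a1, p_a1_given_b1)          # equal: independent before selection
print(p_a1_sel, p_a1_given_b1_sel)  # unequal: correlated after selection
```

The induced correlation has no common cause — it comes entirely from how the observations were selected, which is exactly the loophole in Reichenbach's dictum.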