
The Book of Why

Metadata

Highlights

we collect data only after we posit the causal model, after we state the scientific query we wish to answer, and after we derive the estimand. — location: 309 ^ref-5062


information about the effects of actions or interventions is simply not available in raw data, unless it is collected by controlled experimental manipulation. — location: 315 ^ref-41423


any query about the mechanism by which causes transmit their effects—the most prototypical “Why?” question—is actually a counterfactual question in disguise. — location: 321 ^ref-42157


That’s all that a deep-learning program can do: fit a function to data. — location: 332 ^ref-45068


human intuition is grounded in causal, not statistical, logic. — location: 364 ^ref-42273


These patterns are called back-door adjustment, front-door adjustment, and instrumental variables, the workhorses of causal inference in practice. — location: 369 ^ref-24562


“We think of a cause as something that makes a difference, and the difference it makes must be a difference from what would have happened without it.” — location: 377 ^ref-9655


you are smarter than your data. Data do not understand causes and effects; humans do. — location: 392 ^ref-46601


this story bore the cultural footprints of the actual process by which Homo sapiens gained dominion over our planet. — location: 407 ^ref-43453


It is useless to ask for the causes of things unless you can imagine their consequences. — location: 435 ^ref-11174


you cannot claim that Eve caused you to eat from the tree unless you can imagine a world in which, counter to facts, she did not hand you the apple. — location: 436 ^ref-33085


The first rung of the ladder calls for predictions based on passive observations. It is characterized by the question “What if I see …?” — location: 483 ^ref-61934


The goal of strong AI is to produce machines with humanlike intelligence, able to converse with and guide humans. Deep learning has instead given us machines with truly impressive abilities but no intelligence. The difference is profound and lies in the absence of a model of reality. — location: 506 ^ref-60588


the defining query of the second rung of the Ladder of Causation is “What if we do…?” What will happen if we change the environment? — location: 537 ^ref-61403


Finding out why a blunder occurred allows us to take the right corrective measures in the future. Finding out why a treatment worked on some people and not on others can lead to a new cure for a disease. — location: 565 ^ref-51503


The advantage we gained from imagining counterfactuals was the same then as it is today: flexibility, the ability to reflect and improve on past actions, and, perhaps even more significant, our willingness to take responsibility for past and current actions. — location: 587 ^ref-23792


One major contribution of AI to the study of cognition has been the paradigm “Representation first, acquisition second.” — location: 624 ^ref-35417


making an event happen means that you emancipate it from all other influences and subject it to one and only one influence—that which enforces its happening. — location: 668 ^ref-63212


Very often the structure of the diagram itself enables us to estimate all sorts of causal and counterfactual relationships: simple or complicated, deterministic or probabilistic, linear or nonlinear. — location: 731 ^ref-34586


human intuition is organized around causal, not statistical, relations. — location: 752 ^ref-29189


Probabilities, as given by expressions like P(Y | X), lie on the first rung of the Ladder of Causation and cannot ever (by themselves) answer queries on the second or third rung. Any attempt to “define” causation in terms of seemingly simpler, first-rung concepts must fail. — location: 768 ^ref-63114
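A minimal numeric sketch of this gap, with made-up probabilities (not from the book): in a model where a confounder Z drives both X and Y, and X has no causal effect on Y at all, the observational first-rung quantity P(Y | X) still shows a strong association, while the second-rung P(Y | do(X)) — here computed by adjusting for Z — does not.

```python
# Toy model (made-up numbers): Z is a confounder, Z -> X and Z -> Y,
# and X has NO causal effect on Y.
p_z1 = 0.5                       # P(Z=1)
p_x1_given_z = {1: 0.9, 0: 0.1}  # P(X=1 | Z)
p_y1_given_z = {1: 0.8, 0: 0.2}  # P(Y=1 | Z); Y is independent of X given Z

# First rung: observational P(Y=1 | X=1), obtained by summing over Z.
p_x1 = sum(p_x1_given_z[z] * (p_z1 if z else 1 - p_z1) for z in (0, 1))
p_y1_given_x1 = sum(
    p_y1_given_z[z] * p_x1_given_z[z] * (p_z1 if z else 1 - p_z1)
    for z in (0, 1)
) / p_x1

# Second rung: P(Y=1 | do(X=1)) via adjustment over Z,
# sum_z P(Y=1 | X=1, Z=z) * P(Z=z). Since Y is independent of X
# given Z here, this reduces to the base rate of Y.
p_y1_do_x1 = sum(p_y1_given_z[z] * (p_z1 if z else 1 - p_z1) for z in (0, 1))

print(round(p_y1_given_x1, 2))  # 0.74 -- seeing X=1 "predicts" Y via Z
print(round(p_y1_do_x1, 2))     # 0.5  -- but intervening on X changes nothing
```

Seeing X = 1 raises the probability of Y to 0.74, yet setting X = 1 leaves it at its base rate of 0.5: no first-rung expression alone could distinguish these two numbers.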


definitions demand reduction, and reduction demands going to a lower rung. — location: 771 ^ref-31136


confounding too is a causal concept and hence defies probabilistic formulation. — location: 787 ^ref-55254


the notion of probability raising cannot be expressed in terms of probabilities. — location: 796 ^ref-57924


while probabilities encode our beliefs about a static world, causality tells us whether and how probabilities change when the world changes, be it by intervention or by act of imagination. — location: 829 ^ref-39370


the scatter plot has a roughly elliptical shape—a fact that was crucial to Galton’s analysis and characteristic of bell-shaped distributions with two variables. — location: 944 ^ref-54656


the slope of the best-fit line always enjoys the same properties. It equals 1 only when one quantity can predict the other precisely; it is 0 whenever the prediction is no better than a random guess. The slope (after scaling) is the same no matter whether you plot X against Y or Y against X. In other words, the slope is completely agnostic as to cause and effect. One variable could cause the other, or they could both be effects of a third cause; for the purpose of prediction, it does not matter. — location: 972 ^ref-48349
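This symmetry is easy to check numerically. A sketch with toy data (my own, for illustration): the standardized slope cov(X, Y) / (sd_X · sd_Y) is Pearson's correlation coefficient — it equals 1 for a perfectly predictable linear relationship and is identical whichever variable is plotted against which.

```python
def standardized_slope(a, b):
    """Slope of the best-fit line after scaling both variables to unit
    variance -- i.e., Pearson's correlation coefficient."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    var_a = sum((ai - ma) ** 2 for ai in a)
    var_b = sum((bi - mb) ** 2 for bi in b)
    return cov / (var_a * var_b) ** 0.5

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_exact = [2 * xi + 1 for xi in x]    # perfectly predictable from x
y_noisy = [3.1, 4.8, 7.3, 8.9, 11.2]  # made-up noisy version

print(standardized_slope(x, y_exact))  # 1.0: one quantity predicts the other
print(standardized_slope(x, y_noisy))  # below 1.0: imperfect prediction
print(standardized_slope(x, y_noisy) ==
      standardized_slope(y_noisy, x))  # True: agnostic to cause and effect
```

Swapping the arguments changes nothing in the formula, which is exactly why the slope carries no information about causal direction.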


“Correlation does not imply causation” should give way to “Some correlations do imply causation.” — location: 1169 ^ref-63359


The realization that you cannot even tell A → B → C apart from A ← B ← C from data alone was a painful frustration. — location: 2022 ^ref-17304


Collider-induced correlations — location: 2990 ^ref-46743


that Reichenbach’s dictum was too strong, because it neglects to account for the process by which observations are selected. — location: 3000 ^ref-61700
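A sketch of selection inducing a collider correlation, with made-up numbers: take two independent fair binary variables A and B, and observe a case only when at least one of them is 1 (selection S = A or B, a collider). Before selection, learning B tells us nothing about A; within the selected sample, learning B = 1 lowers the probability that A = 1.

```python
from itertools import product

# Two independent fair binary variables; all four worlds equally likely.
samples = [(a, b) for a, b in product((0, 1), repeat=2)]

def p(event, world):
    """Probability of `event` over equally weighted `world` outcomes."""
    return sum(1 for w in world if event(w)) / len(world)

# Without selection: A and B are independent.
p_a1 = p(lambda w: w[0] == 1, samples)                    # 0.5
p_a1_given_b1 = p(lambda w: w[0] == 1,
                  [w for w in samples if w[1] == 1])      # 0.5

# Condition on the collider: keep only worlds where S = (A or B) = 1.
selected = [w for w in samples if w[0] or w[1]]
p_a1_sel = p(lambda w: w[0] == 1, selected)               # 2/3
p_a1_given_b1_sel = p(lambda w: w[0] == 1,
                      [w for w in selected if w[1] == 1])  # 1/2

print(p_a1, p_a1_given_b1)          # equal: independent before selection
print(p_a1_sel, p_a1_given_b1_sel)  # unequal: correlated after selection
```

The induced correlation has no common cause — it comes entirely from how the observations were selected, which is exactly the loophole in Reichenbach's dictum.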