AP Statistics Cheat Sheet 2026
Descriptive Statistics
| Concept | Formula / Rule |
|---|---|
| Mean | $\bar{x} = \frac{\sum x_i}{n}$ |
| Median | Middle value when sorted; resistant to outliers |
| Standard deviation (sample) | $s = \sqrt{\frac{\sum(x_i - \bar{x})^2}{n-1}}$ |
| IQR | $Q_3 - Q_1$; resistant measure of spread |
| Outlier rule (boxplot) | Below $Q_1 - 1.5\cdot\text{IQR}$ or above $Q_3 + 1.5\cdot\text{IQR}$ |
| z-score | $z = \frac{x - \mu}{\sigma}$; number of standard deviations $x$ lies above (+) or below (−) the mean |
| Percentile | % of data at or below that value |
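The rows above can be checked with a minimal Python sketch using only the standard library. The data list is made up for illustration, and the quartiles use the median-of-halves convention (the median is left out of both halves when $n$ is odd), which is an assumption here; other software may report slightly different quartiles.

```python
from statistics import mean, median, stdev

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using median-of-halves quartiles."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]            # values below the median position
    upper = xs[half + n % 2:]    # values above the median position
    return xs[0], median(lower), median(xs), median(upper), xs[-1]

def outlier_fences(q1, q3):
    """Return the (low, high) fences from the 1.5 * IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

data = [2, 4, 4, 5, 7, 9, 10, 12, 30]    # hypothetical sample

x_bar = mean(data)                        # sum of values / n
s = stdev(data)                           # sample SD (n - 1 in the denominator)
summary = five_number_summary(data)
low, high = outlier_fences(summary[1], summary[3])
outliers = [x for x in data if x < low or x > high]

print("mean:", round(x_bar, 3), "s:", round(s, 3))
print("five-number summary:", summary)
print("fences:", (low, high), "outliers:", outliers)
```

Here the value 30 pulls the mean above the median, which is the sense in which the median and IQR are resistant and the mean and standard deviation are not.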
Regression
| Concept | Formula / Rule |
|---|---|
| LSRL equation | $\hat{y} = a + bx$ |
| Slope | $b = r \cdot \frac{s_y}{s_x}$ |
| Intercept | $a = \bar{y} - b\bar{x}$ |
| Correlation $r$ | $-1 \leq r \leq 1$; direction + strength of linear relationship |
| $r^2$ | % of variation in $y$ explained by the linear model |
| Residual | $y - \hat{y}$ (actual minus predicted) |
| Residual plot | Should show random scatter — no pattern means linear model is appropriate |
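A short sketch of how the slope, intercept, and residual formulas fit together, written in plain Python. The five data points are invented for illustration, and `lsrl` is a hypothetical helper name, not a calculator or library function.

```python
from statistics import mean, stdev

def lsrl(xs, ys):
    """Return (a, b, r) for the least-squares line y_hat = a + b*x."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    # Correlation: average product of standardized values (n - 1 divisor)
    r = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / ((n - 1) * s_x * s_y)
    b = r * s_y / s_x          # slope
    a = y_bar - b * x_bar      # intercept: line passes through (x_bar, y_bar)
    return a, b, r

xs = [1, 2, 3, 4, 5]           # hypothetical explanatory values
ys = [2, 4, 5, 4, 5]           # hypothetical response values

a, b, r = lsrl(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]   # actual minus predicted

print(f"y_hat = {a:.2f} + {b:.2f}x, r = {r:.3f}, r^2 = {r * r:.2f}")
print("residuals:", [round(e, 2) for e in residuals])
```

The residuals of a least-squares line always sum to zero (up to rounding), so a residual plot is judged by its pattern, not by its average.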
Probability
| Rule | Formula |
|---|---|
| Addition rule (any events) | $P(A \cup B) = P(A) + P(B) - P(A \cap B)$ |
| Addition rule (mutually exclusive) | $P(A \cup B) = P(A) + P(B)$ |
| Multiplication rule (any) | $P(A \cap B) = P(A) \cdot P(B \mid A)$ |
| Independent events | $P(A \cap B) = P(A) \cdot P(B)$; knowing A doesn't change P(B) |
| Conditional probability | $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$ |
| Expected value | $E(X) = \sum x_i \cdot P(x_i)$ |
| Variance of X + Y (independent) | $\sigma^2_{X+Y} = \sigma^2_X + \sigma^2_Y$ |
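To make the rules above concrete, here is a small Python sketch. All probabilities are made-up numbers, and the variance formula $\sigma^2_X = \sum (x_i - \mu)^2 P(x_i)$ is the standard definition, which is assumed here because the table lists only the expected value.

```python
from math import sqrt

def expected_value(xs, ps):
    """E(X) = sum of x_i * P(x_i)."""
    return sum(x * p for x, p in zip(xs, ps))

def variance(xs, ps):
    """Var(X) = sum of (x_i - mu)^2 * P(x_i)."""
    mu = expected_value(xs, ps)
    return sum((x - mu) ** 2 * p for x, p in zip(xs, ps))

# Hypothetical discrete distribution for X
values = [0, 1, 2]
probs = [0.2, 0.5, 0.3]

mu_x = expected_value(values, probs)
var_x = variance(values, probs)

# If Y is independent of X with the same distribution, variances add
# (standard deviations do not).
var_sum = var_x + var_x
sd_sum = sqrt(var_sum)

# Addition and conditional rules with hypothetical event probabilities
p_a, p_b, p_a_and_b = 0.5, 0.4, 0.2
p_a_or_b = p_a + p_b - p_a_and_b            # general addition rule
p_a_given_b = p_a_and_b / p_b               # conditional probability
independent = abs(p_a_and_b - p_a * p_b) < 1e-12

print(mu_x, var_x, sd_sum)
print(p_a_or_b, p_a_given_b, independent)
```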
Distributions
Binomial Distribution
Conditions: Fixed $n$, two outcomes, constant $p$, independent trials (BINS).
$P(X = k) = \binom{n}{k}p^k(1-p)^{n-k}$, $\mu = np$, $\sigma = \sqrt{np(1-p)}$
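The binomial formulas can be evaluated directly, as in this minimal sketch. The values $n = 10$ and $p = 0.5$ are arbitrary, and `binom_pmf` is a hypothetical helper that plays the role of a calculator's binompdf.

```python
from math import comb, sqrt

def binom_pmf(k, n, p):
    """P(X = k) for a binomial random variable with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.5                                   # hypothetical setting

prob_exactly_5 = binom_pmf(5, n, p)
prob_at_most_2 = sum(binom_pmf(k, n, p) for k in range(3))   # P(X <= 2), like binomcdf
mu = n * p
sigma = sqrt(n * p * (1 - p))

print(prob_exactly_5, prob_at_most_2, mu, round(sigma, 4))
```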
Geometric Distribution
Number of trials up to and including the first success. $P(X = k) = (1-p)^{k-1}p$, $\mu = 1/p$, $\sigma = \sqrt{1-p}/p$
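A quick sketch of the geometric formula, assuming a made-up success probability of $p = 0.2$. The cumulative shortcut $P(X \leq k) = 1 - (1-p)^k$ follows from "no success in the first $k$ trials" being the complement.

```python
def geom_pmf(k, p):
    """P(first success occurs on trial k): k - 1 failures, then one success."""
    return (1 - p) ** (k - 1) * p

def geom_cdf(k, p):
    """P(first success occurs on or before trial k)."""
    return 1 - (1 - p) ** k

p = 0.2                            # hypothetical success probability

prob_on_third = geom_pmf(3, p)     # fail, fail, succeed
prob_within_three = geom_cdf(3, p)
expected_trials = 1 / p

print(prob_on_third, prob_within_three, expected_trials)
```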
Normal Distribution
Symmetric, bell-shaped. Fully described by $\mu$ and $\sigma$. Use $z = (x-\mu)/\sigma$ then normalcdf on calculator.
Empirical rule: 68% within 1σ, 95% within 2σ, 99.7% within 3σ.
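The empirical rule can be verified with Python's `statistics.NormalDist`. The `normalcdf` function below is a hypothetical stand-in that mimics the calculator command's argument order, and the mean of 100 and SD of 15 are illustrative values only.

```python
from statistics import NormalDist

def normalcdf(lower, upper, mu=0.0, sigma=1.0):
    """Area under the Normal(mu, sigma) curve between lower and upper."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(upper) - dist.cdf(lower)

# Empirical rule: area within k standard deviations of the mean
within = {k: normalcdf(-k, k) for k in (1, 2, 3)}

# Hypothetical example: X ~ Normal(100, 15); find P(X > 130)
mu, sigma, x = 100, 15, 130
z = (x - mu) / sigma
p_above = 1 - NormalDist().cdf(z)       # same as normalcdf(130, infinity, 100, 15)

for k, area in within.items():
    print(f"within {k} SD: {area:.4f}")
print(f"z = {z}, P(X > {x}) = {p_above:.4f}")
```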
Sampling Distributions
| Statistic | Mean | Std Dev (Std Error) | Normal when |
|---|---|---|---|
| $\bar{x}$ (sample mean) | $\mu$ | $\sigma/\sqrt{n}$ | Population normal, or $n \geq 30$ (CLT) |
| $\hat{p}$ (sample proportion) | $p$ | $\sqrt{p(1-p)/n}$ | $np \geq 10$ and $n(1-p) \geq 10$ |
| $\bar{x}_1 - \bar{x}_2$ | $\mu_1 - \mu_2$ | $\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$ | Both populations normal, or both $n_1 \geq 30$ and $n_2 \geq 30$ |
| $\hat{p}_1 - \hat{p}_2$ | $p_1 - p_2$ | $\sqrt{p_1(1-p_1)/n_1 + p_2(1-p_2)/n_2}$ | All four: $n_1p_1, n_1(1-p_1), n_2p_2, n_2(1-p_2) \geq 10$ |
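As an illustration of the first two rows, the sketch below computes standard deviations of sampling distributions and uses the normal approximation for $\hat{p}$. The population values ($p = 0.5$, $n = 100$, $\sigma = 10$, $n = 25$) are assumptions chosen for easy arithmetic.

```python
from math import sqrt
from statistics import NormalDist

def sd_sample_mean(sigma, n):
    """Standard deviation of the sampling distribution of x-bar."""
    return sigma / sqrt(n)

def sd_sample_prop(p, n):
    """Standard deviation of the sampling distribution of p-hat."""
    return sqrt(p * (1 - p) / n)

def large_counts(p, n):
    """Normal approximation for p-hat is reasonable when both counts are >= 10."""
    return n * p >= 10 and n * (1 - p) >= 10

# Hypothetical population proportion and sample size
p, n = 0.5, 100
sd_p_hat = sd_sample_prop(p, n)
approx_ok = large_counts(p, n)

# P(p-hat > 0.6), using Normal(p, sd_p_hat) once the large-counts check passes
prob_above = 1 - NormalDist(p, sd_p_hat).cdf(0.6)

print(sd_sample_mean(10, 25), sd_p_hat, approx_ok, round(prob_above, 4))
```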
All Inference Tests — Quick Reference
| Test | Use for | Test statistic | Calculator |
|---|---|---|---|
| 1-sample z-test for $p$ | One proportion vs. claim | $z = \frac{\hat{p}-p_0}{\sqrt{p_0(1-p_0)/n}}$ | 1-PropZTest |
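| 2-sample z-test for $p_1-p_2$ | Two proportions | $z = \frac{\hat{p}_1-\hat{p}_2}{\sqrt{\hat{p}_c(1-\hat{p}_c)\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}}$, pooled $\hat{p}_c = \frac{x_1+x_2}{n_1+n_2}$ | 2-PropZTest |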
| 1-sample t-test for $\mu$ | One mean vs. claim | $t = \frac{\bar{x}-\mu_0}{s/\sqrt{n}}$, df = $n-1$ | T-Test |
| 2-sample t-test for $\mu_1-\mu_2$ | Two independent means | $t = \frac{\bar{x}_1-\bar{x}_2}{\sqrt{s_1^2/n_1+s_2^2/n_2}}$, df from calculator (conservative: smaller of $n_1-1$, $n_2-1$) | 2-SampTTest |
| Paired t-test | Before/after, matched pairs | $t = \frac{\bar{d}}{s_d/\sqrt{n}}$ on the differences, df = $n-1$ ($n$ = number of pairs) | T-Test on diffs |
| Chi-square GOF | One categorical variable vs. claimed distribution | $\chi^2 = \sum\frac{(O-E)^2}{E}$, df = $k-1$ | $\chi^2$ GOF-Test |
| Chi-square homogeneity | Same distribution across groups? | Same formula, df = $(r-1)(c-1)$ | $\chi^2$-Test |
| Chi-square independence | Two variables associated? | Same formula, df = $(r-1)(c-1)$ | $\chi^2$-Test |
| t-test for slope | Linear relationship in population? | $t = b/SE_b$, df = $n-2$ | LinRegTTest |
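As one worked instance from the table, here is a hedged sketch of the two-proportion z-test with the pooled proportion $\hat{p}_c = (x_1 + x_2)/(n_1 + n_2)$. The counts are invented, the function name is hypothetical, and the p-value shown is two-sided; only the standard library is used, so the t and chi-square tests (which need other distributions) are not covered.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_z_test(x1, n1, x2, n2):
    """Two-sided z-test of H0: p1 = p2 using the pooled proportion."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_c = (x1 + x2) / (n1 + n2)                       # pooled estimate under H0
    se = sqrt(p_c * (1 - p_c) * (1 / n1 + 1 / n2))    # standard error under H0
    z = (p1_hat - p2_hat) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))      # two-sided
    return z, p_value

# Hypothetical data: 60 of 100 successes in group 1, 45 of 100 in group 2
z, p_value = two_prop_z_test(60, 100, 45, 100)
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```

A calculator's 2-PropZTest should report essentially the same $z$ for the same counts, which makes this a convenient cross-check.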
The Three Conditions (Every Test)
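- Random: data come from a random sample or a randomized experiment
- Normal (Large Counts): for means, $n \geq 30$ or population normal; for proportions, $np \geq 10$ and $n(1-p) \geq 10$ (use $p_0$ in a test, $\hat{p}$ in an interval); for chi-square, all expected counts $\geq 5$
- Independence (10% rule): when sampling without replacement, $n \leq 0.10 \cdot N$ (sample is at most 10% of the population)
On FRQs, state and check all three conditions explicitly; omitting one typically costs credit.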
Confidence Intervals
General form: $\text{statistic} \pm z^* \cdot \text{SE}$ or $\text{statistic} \pm t^* \cdot \text{SE}$
| Confidence level | z* |
|---|---|
| 90% | 1.645 |
| 95% | 1.960 |
| 99% | 2.576 |
Interpreting CIs: "We are 95% confident that the true [parameter] is between [lower] and [upper]." Never say "95% probability" — the interval either contains the parameter or it doesn't.
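The general form can be sketched for a one-proportion z-interval, where the standard error uses $\hat{p}$ in place of the unknown $p$. The sample (60 successes in 100 trials) is hypothetical, and `z_star` is a helper name introduced here to recover the table's critical values.

```python
from math import sqrt
from statistics import NormalDist

def z_star(confidence):
    """Critical value with the given central area under the standard normal curve."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def one_prop_z_interval(x, n, confidence=0.95):
    """Return (lower, upper) for statistic +/- z* * SE with statistic = p-hat."""
    p_hat = x / n
    se = sqrt(p_hat * (1 - p_hat) / n)
    margin = z_star(confidence) * se
    return p_hat - margin, p_hat + margin

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z* = {z_star(level):.3f}")

lower, upper = one_prop_z_interval(60, 100, confidence=0.95)   # hypothetical data
print(f"95% CI for p: ({lower:.3f}, {upper:.3f})")
```

Raising the confidence level increases $z^*$ and widens the interval; increasing $n$ shrinks the standard error and narrows it.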
Hypothesis Testing Framework
- State hypotheses: $H_0: p = p_0$ vs $H_a: p > p_0$ (or $<$ or $\neq$)
- Check conditions (Random, Normal, Independence)
- Calculate test statistic and p-value
- Make decision: If p-value $< \alpha$, reject $H_0$. If $\geq \alpha$, fail to reject $H_0$.
- Conclude in context: "There is/is not convincing evidence that [Ha in words]."
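The steps above can be sketched for a one-sample z-test for a proportion. The claim $p_0 = 0.5$, the sample of 60 successes in 100, and $\alpha = 0.05$ are all assumptions for illustration. Only the Large Counts check is automated; the Random and 10% conditions depend on how the data were collected and must be verified from the problem context.

```python
from math import sqrt
from statistics import NormalDist

def one_prop_z_test(x, n, p0, alternative="greater", alpha=0.05):
    """Return (z, p_value, decision) for H0: p = p0."""
    # Check conditions: Large Counts uses p0, because H0 is assumed true
    if n * p0 < 10 or n * (1 - p0) < 10:
        raise ValueError("Large Counts condition not met")

    # Calculate the test statistic
    p_hat = x / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

    # Calculate the p-value in the direction of Ha
    cdf = NormalDist().cdf
    if alternative == "greater":
        p_value = 1 - cdf(z)
    elif alternative == "less":
        p_value = cdf(z)
    elif alternative == "two-sided":
        p_value = 2 * (1 - cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")

    # Make the decision by comparing the p-value to alpha
    decision = "reject H0" if p_value < alpha else "fail to reject H0"
    return z, p_value, decision

# Hypothetical test: H0: p = 0.5 vs Ha: p > 0.5, with 60 successes in 100 trials
z, p_value, decision = one_prop_z_test(60, 100, p0=0.5, alternative="greater")
print(f"z = {z:.2f}, p-value = {p_value:.4f}, decision: {decision}")
```

The code stops at the decision; the final step, a conclusion in context, still has to be written in words about the actual parameter.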