Note: The actual exam reference sheet does not label individual formulas in this section — they appear without names.
| Sample Mean | \(\displaystyle\bar{x}=\frac{\displaystyle\sum x_i}{n}\) | ▼ |
What it means
Add up all the data values and divide by how many there are. The mean is the balancing point of the distribution. It is sensitive to outliers — a single extreme value can pull it far from the center of the bulk of the data.
Worked Example
Scores: 72, 85, 90, 88, 75 (\(n=5\))\(\bar{x}=\dfrac{72+85+90+88+75}{5}=\dfrac{410}{5}=\) 82 | ||
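The worked example above can be checked in a couple of lines of Python (standard library only):

```python
# Sample mean of the example's scores: add them up, divide by n.
scores = [72, 85, 90, 88, 75]
x_bar = sum(scores) / len(scores)
print(x_bar)  # 82.0
```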
| Sample Standard Deviation | \(\displaystyle s_x=\sqrt{\frac{\displaystyle\sum(x_i-\bar{x})^2}{n-1}}\) | ▼ |
What it means
Measures how spread out the data are around \(\bar{x}\). We divide by \(n-1\) rather than \(n\) so that \(s_x^2\) is an unbiased estimate of the population variance \(\sigma^2\). Larger \(s_x\) means more variability in the data.
Worked Example
Values: 2, 4, 4, 4, 5, 5, 7, 9 (\(n=8,\;\bar{x}=5\))\(\sum(x_i-\bar{x})^2 = 9{+}1{+}1{+}1{+}0{+}0{+}4{+}16 = 32\) \(s_x = \sqrt{32/7}\approx\) 2.14 | ||
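The same computation in Python, cross-checked against the standard library's `statistics.stdev` (which also uses the \(n-1\) denominator):

```python
import math
from statistics import stdev

data = [2, 4, 4, 4, 5, 5, 7, 9]
x_bar = sum(data) / len(data)             # 5.0
ss = sum((x - x_bar) ** 2 for x in data)  # sum of squared deviations: 32.0
s_x = math.sqrt(ss / (len(data) - 1))     # sqrt(32/7)
assert abs(s_x - stdev(data)) < 1e-9      # agrees with the library routine
print(round(s_x, 2))  # 2.14
```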
| Least-Squares Regression Line | \(\hat{y} = a + bx\) | ▼ |
What it means
\(\hat{y}\) ("y-hat") is the predicted response for a given \(x\). The line minimizes the sum of squared vertical residuals \(\sum(y_i-\hat{y}_i)^2\). \(b\) is the slope and \(a\) is the \(y\)-intercept. Use \(\hat{y}\) only for interpolation within the range of the data.
Worked Example
Predicting test score from hours studied: \(\hat{y}=50+8x\)For \(x=4\) hours: \(\hat{y}=50+8(4)=\) 82 points | ||
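The prediction step is a one-liner; a minimal sketch using the example's fitted coefficients (\(a=50\), \(b=8\)):

```python
# Least-squares prediction y-hat = a + b*x, with the example's coefficients.
def y_hat(x, a=50.0, b=8.0):
    return a + b * x

print(y_hat(4))  # 82.0
```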
| Slope of the LSRL | \(\displaystyle b=r\frac{s_y}{s_x}\) | ▼ |
What it means
The slope is the correlation coefficient scaled by the ratio of the standard deviations. For every 1-unit increase in \(x\), \(\hat{y}\) changes by \(b\) units. The sign of \(b\) always matches the sign of \(r\).
Worked Example
\(r=0.85,\;s_y=10,\;s_x=4\)\(b=0.85\times\tfrac{10}{4}=0.85\times 2.5=\) 2.125 Each extra unit of \(x\) predicts +2.125 units of \(\hat{y}\). | ||
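The slope formula translates directly to code:

```python
# Slope of the LSRL from r and the two standard deviations: b = r * s_y / s_x.
def lsrl_slope(r, s_y, s_x):
    return r * s_y / s_x

b = lsrl_slope(0.85, 10, 4)
print(round(b, 3))  # 2.125
```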
| Correlation Coefficient | \(\displaystyle r=\frac{1}{n-1}\sum\!\left(\frac{x_i-\bar{x}}{s_x}\right)\!\!\left(\frac{y_i-\bar{y}}{s_y}\right)\) | ▼ |
What it means
Measures the strength and direction of a linear association. Always \(-1\le r\le 1\). Values near \(\pm 1\) are strong; near 0 are weak. \(r\) only measures linear association — a perfectly curved relationship can have \(r\approx 0\).
Worked Example (conceptual)
When tall people also tend to be heavier, both \(z\)-scores are positive together → their products are positive → \(r\) is positive. A calculator gives the exact value. Example result: \(r=+0.87\) (strong positive linear association).
| ||
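The formula — an average (with \(n-1\)) of products of \(z\)-scores — can be implemented directly. A quick sketch, checked on perfectly linear data, where \(r\) must equal 1:

```python
import math

def correlation(xs, ys):
    """r as the (n-1)-averaged product of z-scores, as in the formula above."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / (n - 1))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys) / (n - 1))
    return sum((x - mx) / sx * (y - my) / sy for x, y in zip(xs, ys)) / (n - 1)

# Perfectly linear data (y = 2x) gives r = 1.
print(round(correlation([1, 2, 3, 4], [2, 4, 6, 8]), 6))  # 1.0
```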
| \(P(A\cup B)=P(A)+P(B)-P(A\cap B)\) | ▼ |
What it means
Probability that \(A\) or \(B\) (or both) occur. We subtract the overlap \(P(A\cap B)\) to avoid double-counting it. If \(A\) and \(B\) are mutually exclusive, \(P(A\cap B)=0\) and the rule simplifies to \(P(A)+P(B)\).
Worked Example
Standard deck: \(P(\heartsuit)=\tfrac{13}{52}\), \(P(\text{face})=\tfrac{12}{52}\), \(P(\heartsuit\cap\text{face})=\tfrac{3}{52}\)\(P(\heartsuit\cup\text{face})=\tfrac{13+12-3}{52}=\tfrac{22}{52}\approx\) 0.423 | |
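The deck arithmetic, spelled out:

```python
# Addition rule on the deck example: hearts, face cards, and their overlap
# (the three face cards that are also hearts).
p_hearts, p_faces, p_both = 13 / 52, 12 / 52, 3 / 52
p_union = p_hearts + p_faces - p_both
print(round(p_union, 3))  # 0.423
```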
| \(\displaystyle P(A\mid B)=\frac{P(A\cap B)}{P(B)}\) | ▼ |
What it means
The probability of \(A\) given that \(B\) has occurred. We restrict the sample space to outcomes where \(B\) happened, then find what fraction also include \(A\). Events are independent when \(P(A\mid B)=P(A)\).
Worked Example
\(P(\text{studied})=0.90\), \(P(\text{pass}\cap\text{studied})=0.63\)\(P(\text{pass}\mid\text{studied})=0.63/0.90=\) 0.70 | |
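The same division in code:

```python
# Conditional probability: P(A | B) = P(A and B) / P(B).
def p_given(p_a_and_b, p_b):
    return p_a_and_b / p_b

print(round(p_given(0.63, 0.90), 2))  # 0.7
```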
| Probability Distribution | Mean | Standard Deviation | |
|---|---|---|---|
|
Discrete random variable, \(X\)
\(\mu_X=E(X)={\displaystyle\sum} x_i\cdot P(x_i)\) | \(\sigma_X=\sqrt{{\displaystyle\sum}(x_i-\mu_X)^2\cdot P(x_i)}\) | ▼ |
Mean — what it means
The expected value: a probability-weighted average of all possible outcomes. Multiply each value by its probability and sum. Think of it as the long-run average if the random process were repeated many, many times.
Worked Example — Fair Die
\(X=\) face value; each \(P(x)=\tfrac{1}{6}\)\(\mu_X=\tfrac{1}{6}(1{+}2{+}3{+}4{+}5{+}6)=\tfrac{21}{6}=\) 3.5 \(\sigma_X=\sqrt{\tfrac{1}{6}[(1{-}3.5)^2+\cdots+(6{-}3.5)^2]}\approx\) 1.71 | |||
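Both formulas for a discrete random variable, applied to the fair-die example:

```python
import math

def mean_sd(dist):
    """Mean and SD of a discrete RV; dist is a list of (value, probability)."""
    mu = sum(x * p for x, p in dist)
    var = sum((x - mu) ** 2 * p for x, p in dist)
    return mu, math.sqrt(var)

die = [(x, 1 / 6) for x in range(1, 7)]
mu, sigma = mean_sd(die)
print(round(mu, 1), round(sigma, 2))  # 3.5 1.71
```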
|
If \(X\) has a binomial distribution with parameters \(n\) and \(p\), then: \(P(X=x)=\dbinom{n}{x}p^x(1-p)^{n-x}\) \(x=0,1,2,\ldots,n\) |
\(np\) | \(\sqrt{np(1-p)}\) | ▼ |
What it means
Use when you have \(n\) independent trials, each with probability \(p\) of success, and you want \(P(\text{exactly }x\text{ successes})\). \(\binom{n}{x}\) counts the arrangements; \(p^x(1-p)^{n-x}\) is the probability of each arrangement. Conditions: Binary, Independent, Number fixed, Same \(p\).
Worked Example
5 T/F questions; random guess (\(p=0.5\)). \(P(X=3)\)?\(\binom{5}{3}(0.5)^3(0.5)^2=10\times 0.125\times 0.25=\) 0.3125 \(\mu=5(0.5)=2.5\); \(\sigma=\sqrt{5(.5)(.5)}\approx 1.12\) | |||
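The binomial pmf, mean, and SD in code (`math.comb` gives \(\binom{n}{x}\)):

```python
from math import comb, sqrt

def binom_pmf(x, n, p):
    """P(X = x) for a binomial with parameters n and p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

print(binom_pmf(3, 5, 0.5))                     # 0.3125
print(5 * 0.5, round(sqrt(5 * 0.5 * 0.5), 2))   # 2.5 1.12
```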
|
If \(X\) has a geometric distribution with parameter \(p\), then: \(P(X=x)=(1-p)^{x-1}p\) \(x=1,2,3,\ldots\) |
\(\dfrac{1}{p}\) | \(\dfrac{\sqrt{1-p}}{p}\) | ▼ |
What it means
Use when you repeat independent trials until the first success. \(X\) = the trial number of the first success. There is no fixed \(n\) — you stop when you succeed. \((1-p)^{x-1}\) is the probability of \(x-1\) failures first.
Worked Example
Free-throw shooter, \(p=0.70\). \(P(\text{first make on 3rd attempt})\)?\(P(X=3)=(0.30)^2(0.70)=0.09\times 0.70=\) 0.063 On average: \(\mu=1/0.70\approx\) 1.43 attempts | |||
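The geometric pmf and mean for the free-throw example:

```python
# Geometric distribution: x - 1 failures, then the first success.
def geom_pmf(x, p):
    return (1 - p) ** (x - 1) * p

print(round(geom_pmf(3, 0.70), 3))  # 0.063
print(round(1 / 0.70, 2))           # 1.43 (expected number of attempts)
```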
| Standardized Test Statistic | \(\displaystyle\text{test statistic}=\frac{\text{statistic}-\text{parameter}}{\text{standard error of the statistic}}\) | ▼ |
What it means
The skeleton of every significance test. Measures how many standard errors the sample statistic lies from the hypothesized parameter value. A large absolute value means the data are unlikely under \(H_0\), providing evidence against it.
Worked Example
\(H_0:\mu=100\). Sample: \(\bar{x}=107\), \(\text{SE}=3.5\)\(t=(107-100)/3.5=7/3.5=\) 2.0 The sample mean is 2 SEs above \(\mu_0=100\). | ||
| Confidence Interval | \(\text{statistic}\pm(\text{critical value})(\text{SE of statistic})\) | ▼ |
What it means
An interval of plausible values for the true parameter. The critical value (\(z^*\) or \(t^*\)) controls the confidence level. Common \(z^*\) values: 1.645 (90%), 1.960 (95%), 2.576 (99%). The margin of error is \(\text{critical value}\times\text{SE}\).
Worked Example — 95% CI for \(\mu\)
\(\bar{x}=52,\;\text{SE}=2.4,\;z^*=1.960\)\(52\pm 1.960(2.4)=52\pm 4.70=\) (47.30, 56.70) We are 95% confident the true mean is in this interval. | ||
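The statistic ± margin-of-error template in code:

```python
# Confidence interval: statistic +/- (critical value) * SE.
def conf_int(stat, crit, se):
    moe = crit * se
    return stat - moe, stat + moe

lo, hi = conf_int(52, 1.960, 2.4)
print(round(lo, 2), round(hi, 2))  # 47.3 56.7
```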
| Chi-Square Statistic | \(\displaystyle\chi^2=\sum\frac{(\text{observed}-\text{expected})^2}{\text{expected}}\) | ▼ |
What it means
Measures how far observed counts deviate from expected counts under \(H_0\). Each cell contributes one term; larger \(\chi^2\) = more evidence against \(H_0\). Used for goodness-of-fit and tests of independence/homogeneity. Always \(\chi^2\ge 0\); the test is always right-tailed.
Worked Example
Fair die, 60 rolls. Expected: 10 per face. Observed: 8,12,9,11,7,13\(\chi^2=\tfrac{4}{10}+\tfrac{4}{10}+\tfrac{1}{10}+\tfrac{1}{10}+\tfrac{9}{10}+\tfrac{9}{10}=\) 2.8 (df=5; fail to reject \(H_0\)) | ||
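The cell-by-cell sum for the die example:

```python
# Chi-square: sum of (observed - expected)^2 / expected over all cells.
def chi_square(observed, expected):
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

obs = [8, 12, 9, 11, 7, 13]
exp = [10] * 6  # 60 rolls of a fair die: 10 expected per face
print(round(chi_square(obs, exp), 1))  # 2.8
```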
| Random Variable | Parameters of Sampling Distribution | Standard Error* of Sample Statistic | |
|---|---|---|---|
| One population: \(\hat{p}\) |
\(\mu_{\hat{p}}=p\) \(\sigma_{\hat{p}}=\sqrt{\dfrac{p(1-p)}{n}}\) |
\(s_{\hat{p}}=\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}\) | ▼ |
What it means
For a significance test, use \(\sigma_{\hat{p}}\) with \(p_0\) from \(H_0\) (you're assuming \(p\) is known). For a confidence interval, use \(s_{\hat{p}}\) with \(\hat{p}\) in place of \(p\) (you don't know \(p\), so you estimate it).
Worked Example — 95% CI
\(n=400,\;\hat{p}=0.62\)\(s_{\hat{p}}=\sqrt{0.62(0.38)/400}\approx 0.0242\) \(0.62\pm 1.96(0.0242)=\) (0.572, 0.668) | |||
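The one-proportion interval end to end:

```python
import math

def prop_ci(p_hat, n, z=1.96):
    """95% CI for p (by default) using the estimated SE with p-hat."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    moe = z * se
    return round(p_hat - moe, 3), round(p_hat + moe, 3)

print(prop_ci(0.62, 400))  # (0.572, 0.668)
```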
| Two populations: \(\hat{p}_1-\hat{p}_2\) |
\(\mu=p_1-p_2\) \(\sigma=\sqrt{\dfrac{p_1(1-p_1)}{n_1}+\dfrac{p_2(1-p_2)}{n_2}}\) |
CI: \(\sqrt{\dfrac{\hat{p}_1(1-\hat{p}_1)}{n_1}+\dfrac{\hat{p}_2(1-\hat{p}_2)}{n_2}}\) Test (\(p_1=p_2\) assumed): \(\sqrt{\hat{p}_c(1-\hat{p}_c)\!\left(\tfrac{1}{n_1}+\tfrac{1}{n_2}\right)}\) where \(\hat{p}_c=\dfrac{X_1+X_2}{n_1+n_2}\) |
▼ |
What it means
CI: use each sample's \(\hat{p}\) separately — you are not assuming equal proportions. Test: under \(H_0: p_1=p_2\), pool the two samples to estimate the common proportion \(\hat{p}_c\), and use it in both variance terms.
Worked Example — Pooled Test
\(X_1=30,\,n_1=100;\;X_2=45,\,n_2=150\)\(\hat{p}_c=75/250=0.30\) \(\text{SE}=\sqrt{0.30(0.70)(1/100+1/150)}\approx\) 0.059 | |||
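The pooled standard error from the counts:

```python
import math

def pooled_se(x1, n1, x2, n2):
    """SE for a two-proportion test, pooling under H0: p1 = p2."""
    p_c = (x1 + x2) / (n1 + n2)
    return math.sqrt(p_c * (1 - p_c) * (1 / n1 + 1 / n2))

print(round(pooled_se(30, 100, 45, 150), 3))  # 0.059
```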
| Random Variable | Parameters of Sampling Distribution | Standard Error* of Sample Statistic | |
|---|---|---|---|
| One population: \(\bar{x}\) |
\(\mu_{\bar{x}}=\mu\) \(\sigma_{\bar{x}}=\dfrac{\sigma}{\sqrt{n}}\) |
\(s_{\bar{x}}=\dfrac{s}{\sqrt{n}}\) | ▼ |
What it means
The sampling distribution of \(\bar{x}\) is centered at \(\mu\) with SE \(=s/\sqrt{n}\). Larger \(n\) → smaller SE → more precise estimate. The Central Limit Theorem guarantees \(\bar{x}\) is approximately Normal for large \(n\), regardless of population shape.
Worked Example — One-Sample \(t\)-test
\(n=25,\;\bar{x}=78,\;s=12\)\(\text{SE}=12/\sqrt{25}=12/5=2.4\) Test \(H_0:\mu=75\): \(t=(78-75)/2.4=\) 1.25 (df=24) | |||
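The one-sample \(t\)-statistic from summary values:

```python
import math

def one_sample_t(x_bar, mu0, s, n):
    """t = (x-bar - mu0) / (s / sqrt(n))."""
    se = s / math.sqrt(n)
    return (x_bar - mu0) / se

print(round(one_sample_t(78, 75, 12, 25), 2))  # 1.25
```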
| Two populations: \(\bar{x}_1-\bar{x}_2\) |
\(\mu=\mu_1-\mu_2\) \(\sigma=\sqrt{\dfrac{\sigma_1^2}{n_1}+\dfrac{\sigma_2^2}{n_2}}\) |
\(s_{\bar{x}_1-\bar{x}_2}=\sqrt{\dfrac{s_1^2}{n_1}+\dfrac{s_2^2}{n_2}}\) | ▼ |
What it means
Combines variability from two independent samples. Because the samples are independent, variances add: \(s_1^2/n_1+s_2^2/n_2\). Take the square root for the SE. Used in two-sample \(t\)-tests and confidence intervals for \(\mu_1-\mu_2\).
Worked Example
A: \(\bar{x}_1=84,\,s_1=10,\,n_1=30\) | B: \(\bar{x}_2=79,\,s_2=8,\,n_2=40\)\(\text{SE}=\sqrt{100/30+64/40}=\sqrt{4.93}\approx 2.22\) \(t=(84-79)/2.22\approx\) 2.25 | |||
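The two-sample SE and \(t\)-statistic from the summary values:

```python
import math

def two_sample_t(x1, s1, n1, x2, s2, n2):
    """Unpooled two-sample t: variances of independent samples add."""
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return (x1 - x2) / se, se

t, se = two_sample_t(84, 10, 30, 79, 8, 40)
print(round(se, 2), round(t, 2))  # 2.22 2.25
```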
| Random Variable | Parameters of Sampling Distribution | Standard Error* of Sample Statistic | |
|---|---|---|---|
|
For slope: \(b\) |
\(\mu_b=\beta\) \(\sigma_b=\dfrac{\sigma}{\sigma_x\sqrt{n}}\)
\(s_b=\dfrac{s}{s_x\sqrt{n-1}}\) where \(\;s=\sqrt{\dfrac{\displaystyle\sum(y_i-\hat{y}_i)^2}{n-2}}\) and \(\;s_x=\sqrt{\dfrac{\displaystyle\sum(x_i-\bar{x})^2}{n-1}}\)
▼ |
What it means
\(s\) is the residual standard error — how much the \(y\)-values scatter around the regression line. More spread in \(x\) (larger \(s_x\)) → more precise slope estimate → smaller \(s_b\). Test \(H_0:\beta=0\) using \(t=b/s_b\) with df \(=n-2\).
Worked Example
\(n=20,\;b=2.5,\;s=4.1,\;s_x=3.0\)\(s_b=4.1/(3.0\sqrt{19})=4.1/13.08\approx 0.3135\) \(t=2.5/0.3135\approx\) 7.97 (df=18; strong evidence \(\beta\neq 0\)) | |||
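The slope-SE arithmetic in code, with the \(\sqrt{n-1}\) denominator (since \(s_x\) is computed with \(n-1\), \(s_x\sqrt{n-1}=\sqrt{\sum(x_i-\bar{x})^2}\)):

```python
import math

def slope_t(b, s, s_x, n):
    """t for H0: beta = 0, using s_b = s / (s_x * sqrt(n - 1))."""
    s_b = s / (s_x * math.sqrt(n - 1))
    return b / s_b, s_b

t, s_b = slope_t(2.5, 4.1, 3.0, 20)
print(round(s_b, 4), round(t, 2))  # 0.3135 7.97
```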
*Standard deviation is a measurement of variability from the theoretical population. Standard error is the estimate of the standard deviation. If the standard deviation of the statistic is assumed to be known, then the standard deviation should be used instead of the standard error.
Values are exact for listed df rows. When your df falls between rows, the nearest listed row gives an approximation — use a calculator for precision.