AP Statistics  ·  2026 Exam Reference
Formula & Table Reference
An interactive study tool — click any formula to see a plain-English explanation and worked example, and use the table calculators to look up critical values dynamically.
Click any formula row to expand  ·  Use the table calculators below for z, t, and χ² lookups
I.  Descriptive Statistics

Note: The actual exam reference sheet does not label individual formulas in this section — they appear without names.

Sample Mean \(\displaystyle\bar{x}=\frac{\displaystyle\sum x_i}{n}\)
What it means

Add up all the data values and divide by how many there are. The mean is the balancing point of the distribution. It is sensitive to outliers — a single extreme value can pull it far from the center of the bulk of the data.

Worked Example
Scores: 72, 85, 90, 88, 75  (\(n=5\))
\(\bar{x}=\dfrac{72+85+90+88+75}{5}=\dfrac{410}{5}=\) 82
Sample Standard Deviation \(\displaystyle s_x=\sqrt{\frac{\displaystyle\sum(x_i-\bar{x})^2}{n-1}}\)
What it means

Measures how spread out the data are around \(\bar{x}\). We divide by \(n-1\) rather than \(n\) to produce an unbiased estimate of the population standard deviation \(\sigma\). Larger \(s_x\) means more variability in the data.

Worked Example
Values: 2, 4, 4, 4, 5, 5, 7, 9  (\(n=8,\;\bar{x}=5\))
\(\sum(x_i-\bar{x})^2 = 9{+}1{+}1{+}1{+}0{+}0{+}4{+}16 = 32\)
\(s_x = \sqrt{32/7}\approx\) 2.14
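Both statistics can be double-checked with Python's standard `statistics` module (not part of the exam sheet, just a verification sketch):

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)   # sum of values divided by n
s = statistics.stdev(data)     # sample SD: divides by n - 1, matching s_x

print(mean)         # 5
print(round(s, 2))  # 2.14
```

Note that `statistics.stdev` is the sample version (denominator \(n-1\)); `statistics.pstdev` would give the population version with denominator \(n\).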
Least-Squares Regression Line \(\hat{y} = a + bx\)
What it means

\(\hat{y}\) ("y-hat") is the predicted response for a given \(x\). The line minimizes the sum of squared vertical residuals \(\sum(y_i-\hat{y}_i)^2\). \(b\) is the slope and \(a\) is the \(y\)-intercept. Use \(\hat{y}\) only for interpolation within the range of the data.

Worked Example
Predicting test score from hours studied: \(\hat{y}=50+8x\)
For \(x=4\) hours: \(\hat{y}=50+8(4)=\) 82 points
Slope of the LSRL \(\displaystyle b=r\frac{s_y}{s_x}\)
What it means

The slope is the correlation coefficient scaled by the ratio of the standard deviations. For every 1-unit increase in \(x\), \(\hat{y}\) changes by \(b\) units. The sign of \(b\) always matches the sign of \(r\).

Worked Example
\(r=0.85,\;s_y=10,\;s_x=4\)
\(b=0.85\times\tfrac{10}{4}=0.85\times 2.5=\) 2.125
Each extra unit of \(x\) predicts +2.125 units of \(\hat{y}\).
Correlation Coefficient \(\displaystyle r=\frac{1}{n-1}\sum\!\left(\frac{x_i-\bar{x}}{s_x}\right)\!\!\left(\frac{y_i-\bar{y}}{s_y}\right)\)
What it means

Measures the strength and direction of a linear association. Always \(-1\le r\le 1\). Values near \(\pm 1\) are strong; near 0 are weak. \(r\) only measures linear association — a perfectly curved relationship can have \(r\approx 0\).

Worked Example (conceptual)
When tall people also tend to be heavier, both \(z\)-scores are positive together → their products are positive → \(r\) is positive. A calculator gives the exact value. Example result: \(r=+0.87\) (strong positive linear association).
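The z-score-product idea can be made concrete with a short Python sketch. The height/weight numbers below are hypothetical, chosen only to illustrate the formula:

```python
import statistics

# Hypothetical heights (in) and weights (lb) for six people
x = [60, 62, 65, 68, 70, 72]
y = [115, 120, 135, 150, 160, 172]

n = len(x)
xbar, ybar = statistics.mean(x), statistics.mean(y)
sx, sy = statistics.stdev(x), statistics.stdev(y)

# r = (1/(n-1)) * sum of paired z-score products
r = sum(((xi - xbar) / sx) * ((yi - ybar) / sy)
        for xi, yi in zip(x, y)) / (n - 1)

print(round(r, 3))  # close to +1: a strong positive linear association
```

Because taller people in this toy data are also heavier, both z-scores in each product share a sign, so the products are positive and \(r\) comes out strongly positive.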
II.  Probability and Distributions
\(P(A\cup B)=P(A)+P(B)-P(A\cap B)\)
What it means

Probability that \(A\) or \(B\) (or both) occur. We subtract the overlap \(P(A\cap B)\) to avoid double-counting it. If \(A\) and \(B\) are mutually exclusive, \(P(A\cap B)=0\) and the rule simplifies to \(P(A)+P(B)\).

Worked Example
Standard deck: \(P(\heartsuit)=\tfrac{13}{52}\), \(P(\text{face})=\tfrac{12}{52}\), \(P(\heartsuit\cap\text{face})=\tfrac{3}{52}\)
\(P(\heartsuit\cup\text{face})=\tfrac{13+12-3}{52}=\tfrac{22}{52}\approx\) 0.423
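Rather than trusting the arithmetic, you can enumerate the whole deck and count directly (a brute-force check, not an exam technique):

```python
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))   # all 52 (rank, suit) pairs

faces = {"J", "Q", "K"}
# "heart OR face" counted once per card, so no double-counting is possible
favorable = [(r, s) for r, s in deck if s == "hearts" or r in faces]

print(len(favorable))                  # 22 cards
print(round(len(favorable) / 52, 3))  # 0.423
```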
\(\displaystyle P(A\mid B)=\frac{P(A\cap B)}{P(B)}\)
What it means

The probability of \(A\) given that \(B\) has occurred. We restrict the sample space to outcomes where \(B\) happened, then find what fraction also include \(A\). Events are independent when \(P(A\mid B)=P(A)\).

Worked Example
\(P(\text{studied})=0.90\), \(P(\text{pass}\cap\text{studied})=0.63\)
\(P(\text{pass}\mid\text{studied})=0.63/0.90=\) 0.70
Probability Distribution  ·  Mean  ·  Standard Deviation
Discrete random variable, \(X\)
Mean: \(\mu_X=E(X)={\displaystyle\sum} x_i\cdot P(x_i)\)
Standard Deviation: \(\sigma_X=\sqrt{{\displaystyle\sum}(x_i-\mu_X)^2\cdot P(x_i)}\)
Mean — what it means

The expected value: a probability-weighted average of all possible outcomes. Multiply each value by its probability and sum. Think of it as the long-run average if the random process were repeated many, many times.

Worked Example — Fair Die
\(X=\) face value; each \(P(x)=\tfrac{1}{6}\)
\(\mu_X=\tfrac{1}{6}(1{+}2{+}3{+}4{+}5{+}6)=\tfrac{21}{6}=\) 3.5
\(\sigma_X=\sqrt{\tfrac{1}{6}[(1{-}3.5)^2+\cdots+(6{-}3.5)^2]}\approx\) 1.71
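The die example translates directly into code. Here is a minimal sketch of the expected-value and SD sums:

```python
from math import sqrt

outcomes = [1, 2, 3, 4, 5, 6]
p = 1 / 6                      # fair die: each face equally likely

mu = sum(x * p for x in outcomes)                      # probability-weighted average
sigma = sqrt(sum((x - mu) ** 2 * p for x in outcomes))

print(round(mu, 2))     # 3.5
print(round(sigma, 2))  # 1.71
```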
If \(X\) has a binomial distribution with parameters \(n\) and \(p\), then:
\(P(X=x)=\dbinom{n}{x}p^x(1-p)^{n-x}\)
\(x=0,1,2,\ldots,n\)
Mean: \(np\)  ·  Standard Deviation: \(\sqrt{np(1-p)}\)
What it means

Use when you have \(n\) independent trials, each with probability \(p\) of success, and you want \(P(\text{exactly }x\text{ successes})\). \(\binom{n}{x}\) counts the arrangements; \(p^x(1-p)^{n-x}\) is the probability of each arrangement. Conditions: Binary, Independent, Number fixed, Same \(p\).

Worked Example
5 T/F questions; random guess (\(p=0.5\)). \(P(X=3)\)?
\(\binom{5}{3}(0.5)^3(0.5)^2=10\times 0.125\times 0.25=\) 0.3125
\(\mu=5(0.5)=2.5\);  \(\sigma=\sqrt{5(.5)(.5)}\approx 1.12\)
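Python's `math.comb` computes \(\binom{n}{x}\) directly, so the whole example checks out in a few lines (a verification sketch, not an exam method):

```python
from math import comb, sqrt

n, p, x = 5, 0.5, 3
prob = comb(n, x) * p**x * (1 - p)**(n - x)   # C(5,3) * 0.5^3 * 0.5^2

print(prob)                              # 0.3125
print(n * p)                             # mean: 2.5
print(round(sqrt(n * p * (1 - p)), 2))   # SD: 1.12
```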
If \(X\) has a geometric distribution with parameter \(p\), then:
\(P(X=x)=(1-p)^{x-1}p\)
\(x=1,2,3,\ldots\)
Mean: \(\dfrac{1}{p}\)  ·  Standard Deviation: \(\dfrac{\sqrt{1-p}}{p}\)
What it means

Use when you repeat independent trials until the first success. \(X\) = the trial number of the first success. There is no fixed \(n\) — you stop when you succeed. \((1-p)^{x-1}\) is the probability of \(x-1\) failures first.

Worked Example
Free-throw shooter, \(p=0.70\). \(P(\text{first make on 3rd attempt})\)?
\(P(X=3)=(0.30)^2(0.70)=0.09\times 0.70=\) 0.063
On average: \(\mu=1/0.70\approx\) 1.43 attempts
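The same arithmetic as a quick Python sketch:

```python
p = 0.70   # probability of making each free throw
x = 3      # first success on attempt 3

prob = (1 - p) ** (x - 1) * p   # two misses, then a make

print(round(prob, 3))   # 0.063
print(round(1 / p, 2))  # mean: 1.43 attempts
```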
III.  Sampling Distributions and Inferential Statistics
Standardized Test Statistic \(\displaystyle\text{test statistic}=\frac{\text{statistic}-\text{parameter}}{\text{standard error of the statistic}}\)
What it means

The skeleton of every significance test. Measures how many standard errors the sample statistic lies from the hypothesized parameter value. A large absolute value means the data are unlikely under \(H_0\), providing evidence against it.

Worked Example
\(H_0:\mu=100\). Sample: \(\bar{x}=107\), \(\text{SE}=3.5\)
\(t=(107-100)/3.5=7/3.5=\) 2.0
The sample mean is 2 SEs above \(\mu_0=100\).
Confidence Interval \(\text{statistic}\pm(\text{critical value})(\text{SE of statistic})\)
What it means

An interval of plausible values for the true parameter. The critical value (\(z^*\) or \(t^*\)) controls the confidence level. Common \(z^*\) values: 1.645 (90%), 1.960 (95%), 2.576 (99%). The margin of error is \(\text{critical value}\times\text{SE}\).

Worked Example — 95% CI for \(\mu\)
\(\bar{x}=52,\;\text{SE}=2.4,\;z^*=1.960\)
\(52\pm 1.960(2.4)=52\pm 4.70=\) (47.30, 56.70)
We are 95% confident the true mean is in this interval.
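The interval arithmetic can be reproduced in a couple of lines (a sketch, not part of the exam sheet):

```python
xbar, se, z_star = 52, 2.4, 1.960

moe = z_star * se              # margin of error
lo, hi = xbar - moe, xbar + moe

print(f"{moe:.2f}")             # 4.70
print(f"({lo:.2f}, {hi:.2f})")  # (47.30, 56.70)
```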
Chi-Square Statistic \(\displaystyle\chi^2=\sum\frac{(\text{observed}-\text{expected})^2}{\text{expected}}\)
What it means

Measures how far observed counts deviate from expected counts under \(H_0\). Each cell contributes one term; larger \(\chi^2\) = more evidence against \(H_0\). Used for goodness-of-fit and tests of independence/homogeneity. Always \(\chi^2\ge 0\); the test is always right-tailed.

Worked Example
Fair die, 60 rolls. Expected: 10 per face. Observed: 8,12,9,11,7,13
\(\chi^2=\tfrac{4}{10}+\tfrac{4}{10}+\tfrac{1}{10}+\tfrac{1}{10}+\tfrac{9}{10}+\tfrac{9}{10}=\) 2.8  (df=5; fail to reject \(H_0\))
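The cell-by-cell sum is easy to verify in Python:

```python
observed = [8, 12, 9, 11, 7, 13]
expected = [10] * 6              # 60 rolls, 6 equally likely faces

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(chi2, 1))   # 2.8
```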
Sampling Distributions for Proportions
Random Variable  ·  Parameters of Sampling Distribution  ·  Standard Error* of Sample Statistic
One population:
\(\hat{p}\)
\(\mu_{\hat{p}}=p\)
\(\sigma_{\hat{p}}=\sqrt{\dfrac{p(1-p)}{n}}\)
\(s_{\hat{p}}=\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}\)
What it means

For a significance test, use \(\sigma_{\hat{p}}\) with \(p_0\) from \(H_0\) (you're assuming \(p\) is known). For a confidence interval, use \(s_{\hat{p}}\) with \(\hat{p}\) in place of \(p\) (you don't know \(p\), so you estimate it).

Worked Example — 95% CI
\(n=400,\;\hat{p}=0.62\)
\(s_{\hat{p}}=\sqrt{0.62(0.38)/400}\approx 0.0243\)
\(0.62\pm 1.96(0.0242)=\) (0.572, 0.668)
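A short sketch reproducing the standard error and interval:

```python
from math import sqrt

n, p_hat, z_star = 400, 0.62, 1.96

se = sqrt(p_hat * (1 - p_hat) / n)   # s_p-hat: estimate p with p-hat
moe = z_star * se

print(round(se, 4))                               # 0.0243
print(f"({p_hat - moe:.3f}, {p_hat + moe:.3f})")  # (0.572, 0.668)
```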
Two populations:
\(\hat{p}_1-\hat{p}_2\)
\(\mu=p_1-p_2\)
\(\sigma=\sqrt{\dfrac{p_1(1-p_1)}{n_1}+\dfrac{p_2(1-p_2)}{n_2}}\)
CI: \(\sqrt{\dfrac{\hat{p}_1(1-\hat{p}_1)}{n_1}+\dfrac{\hat{p}_2(1-\hat{p}_2)}{n_2}}\)

Test (\(p_1=p_2\) assumed):
\(\sqrt{\hat{p}_c(1-\hat{p}_c)\!\left(\tfrac{1}{n_1}+\tfrac{1}{n_2}\right)}\)
where \(\hat{p}_c=\dfrac{X_1+X_2}{n_1+n_2}\)
What it means

CI: use each sample's \(\hat{p}\) separately — you are not assuming equal proportions.

Significance test (\(H_0:p_1=p_2\)): pool both samples into a combined \(\hat{p}_c\), because under \(H_0\) you assume the true proportions are equal.

Worked Example — Pooled Test
\(X_1=30,\,n_1=100;\;X_2=45,\,n_2=150\)
\(\hat{p}_c=75/250=0.30\)
\(\text{SE}=\sqrt{0.30(0.70)(1/100+1/150)}\approx\) 0.059
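The pooled computation as a quick Python check:

```python
from math import sqrt

x1, n1, x2, n2 = 30, 100, 45, 150
p_c = (x1 + x2) / (n1 + n2)      # pooled proportion, assuming H0: p1 = p2

se = sqrt(p_c * (1 - p_c) * (1 / n1 + 1 / n2))

print(p_c)           # 0.3
print(round(se, 3))  # 0.059
```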
Sampling Distributions for Means
Random Variable  ·  Parameters of Sampling Distribution  ·  Standard Error* of Sample Statistic
One population:
\(\bar{x}\)
\(\mu_{\bar{x}}=\mu\)
\(\sigma_{\bar{x}}=\dfrac{\sigma}{\sqrt{n}}\)
\(s_{\bar{x}}=\dfrac{s}{\sqrt{n}}\)
What it means

The sampling distribution of \(\bar{x}\) is centered at \(\mu\) with SE \(=s/\sqrt{n}\). Larger \(n\) → smaller SE → more precise estimate. The Central Limit Theorem guarantees \(\bar{x}\) is approximately Normal for large \(n\), regardless of population shape.

Worked Example — One-Sample \(t\)-test
\(n=25,\;\bar{x}=78,\;s=12\)
\(\text{SE}=12/\sqrt{25}=12/5=2.4\)
Test \(H_0:\mu=75\): \(t=(78-75)/2.4=\) 1.25  (df=24)
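The same test statistic in code (a sketch for checking your calculator work):

```python
from math import sqrt

n, xbar, s, mu0 = 25, 78, 12, 75

se = s / sqrt(n)        # standard error of x-bar
t = (xbar - mu0) / se   # t statistic with df = n - 1 = 24

print(se)            # 2.4
print(round(t, 2))   # 1.25
```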
Two populations:
\(\bar{x}_1-\bar{x}_2\)
\(\mu=\mu_1-\mu_2\)
\(\sigma=\sqrt{\dfrac{\sigma_1^2}{n_1}+\dfrac{\sigma_2^2}{n_2}}\)
\(s_{\bar{x}_1-\bar{x}_2}=\sqrt{\dfrac{s_1^2}{n_1}+\dfrac{s_2^2}{n_2}}\)
What it means

Combines variability from two independent samples. Because the samples are independent, variances add: \(s_1^2/n_1+s_2^2/n_2\). Take the square root for the SE. Used in two-sample \(t\)-tests and confidence intervals for \(\mu_1-\mu_2\).

Worked Example
A: \(\bar{x}_1=84,\,s_1=10,\,n_1=30\)  |  B: \(\bar{x}_2=79,\,s_2=8,\,n_2=40\)
\(\text{SE}=\sqrt{100/30+64/40}=\sqrt{4.93}\approx 2.22\)
\(t=(84-79)/2.22\approx\) 2.25
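A sketch of the two-sample computation:

```python
from math import sqrt

x1bar, s1, n1 = 84, 10, 30
x2bar, s2, n2 = 79, 8, 40

se = sqrt(s1**2 / n1 + s2**2 / n2)   # variances add for independent samples
t = (x1bar - x2bar) / se

print(round(se, 2))  # 2.22
print(round(t, 2))   # 2.25
```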
Sampling Distributions for Simple Linear Regression
Random Variable  ·  Parameters of Sampling Distribution  ·  Standard Error* of Sample Statistic
For slope:
\(b\)
\(\mu_b=\beta\)
\(\sigma_b=\dfrac{\sigma}{s_x\sqrt{n}}\)
\(s_b=\dfrac{s}{s_x\sqrt{n}}\)

where \(\;s=\sqrt{\dfrac{\displaystyle\sum(y_i-\hat{y}_i)^2}{n-2}}\)

and \(\;s_x=\sqrt{\dfrac{\displaystyle\sum(x_i-\bar{x})^2}{n-1}}\)
What it means

\(s\) is the residual standard error — how much the \(y\)-values scatter around the regression line. More spread in \(x\) (larger \(s_x\)) → more precise slope estimate → smaller \(s_b\). Test \(H_0:\beta=0\) using \(t=b/s_b\) with df \(=n-2\).

Worked Example
\(n=20,\;b=2.5,\;s=4.1,\;s_x=3.0\)
\(s_b=4.1/(3.0\sqrt{20})=4.1/13.42\approx 0.306\)
\(t=2.5/0.306\approx\) 8.17  (df=18; strong evidence \(\beta\neq 0\))
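The slope standard error and \(t\) statistic in code:

```python
from math import sqrt

n, b, s, sx = 20, 2.5, 4.1, 3.0

s_b = s / (sx * sqrt(n))   # SE of the slope
t = b / s_b                # test H0: beta = 0, df = n - 2 = 18

print(round(s_b, 3))  # 0.306
print(round(t, 2))    # 8.18
```

Carrying the unrounded SE through gives \(t\approx 8.18\); the 8.17 in the worked example comes from using the rounded SE 0.306. Either way, the evidence against \(H_0:\beta=0\) is strong.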

*Standard deviation is a measurement of variability from the theoretical population. Standard error is the estimate of the standard deviation. If the standard deviation of the statistic is assumed to be known, then the standard deviation should be used instead of the standard error.

Tables A / B / C  ·  Interactive Reference
Table A  — Standard Normal (z)
Table B  — t Distribution
Table C  — χ² Distribution

Z Calculator

Enter a z value above.

Table A — Standard Normal Probabilities  (P(Z < z))

t Table Lookup

Select tail probability and df above.

Table B — t Critical Values  (nearest df row highlighted)

Values are exact for listed df rows. When your df falls between rows, the nearest row is highlighted and the result shown is an approximation — use a calculator for precision.

χ² Table Lookup

Select tail probability and df above.

Table C — χ² Critical Values  (nearest df row highlighted)

Values are exact for listed df rows. When your df falls between rows, the nearest row is highlighted and the result shown is an approximation — use a calculator for precision.