A t-test is a statistical test that checks whether the average (mean) of one group is meaningfully different from another group's average — or from a fixed number — rather than the difference being down to random chance. You'd use it when you have one numeric outcome (like test scores, reaction times, or blood pressure) and you want to compare two groups or two time points. If you're staring at two columns of numbers wondering whether the gap between them is "real," a t-test is almost certainly the tool you need.
Key Takeaways
- A t-test compares two means to decide whether their difference is statistically significant or likely just noise.
- There are three types: one-sample (one group vs. a fixed value), independent-samples (two separate groups), and paired-samples (the same people measured twice).
- You need a continuous outcome and two groups (or fewer). For three or more groups, use ANOVA instead.
- A significant result is usually p < .05, meaning that if there were truly no difference, you'd see a gap at least this large less than 5% of the time.
- Always report an effect size (Cohen's d) alongside the p-value — significance alone doesn't tell you how big the difference is.
What Does a T-Test Actually Do?
A t-test asks a simple question: is the difference between two averages big enough to take seriously? Two groups will almost never have identical means, even if they come from the same underlying population — random variation guarantees some gap. The t-test weighs the size of that gap against how much your data naturally bounces around (the spread). A large difference with tight, consistent data produces a big t value and a small p-value. A small difference with wildly scattered data produces a small t value and a large p-value.
Think of it as a signal-to-noise ratio: the difference between means is the signal, the variability in your data is the noise. The bigger the signal relative to the noise, the more confident you can be the effect is real.
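To make the signal-to-noise idea concrete, here is a minimal sketch in plain Python. The function name `pooled_t` and all four data sets are invented for illustration, and the sketch assumes the classic pooled-variance (equal spread) version of the test. Both comparisons have the same 5-point gap; only the spread differs.

```python
import math
import statistics

def pooled_t(group_a, group_b):
    """Return (signal, noise, t) for two independent samples, pooled variance."""
    n_a, n_b = len(group_a), len(group_b)
    # Signal: the raw difference between the two means.
    signal = statistics.mean(group_a) - statistics.mean(group_b)
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / (n_a + n_b - 2)
    # Noise: the standard error of the difference between means.
    noise = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return signal, noise, signal / noise

# Made-up scores: each pair of groups differs by 5 points on average.
tight_a, tight_b = [74, 75, 76, 75, 75], [69, 70, 71, 70, 70]
noisy_a, noisy_b = [60, 90, 75, 65, 85], [55, 85, 70, 60, 80]

signal_tight, noise_tight, t_tight = pooled_t(tight_a, tight_b)
signal_noisy, noise_noisy, t_noisy = pooled_t(noisy_a, noisy_b)

print(f"tight data: signal={signal_tight:.1f}, noise={noise_tight:.2f}, t={t_tight:.2f}")
print(f"noisy data: signal={signal_noisy:.1f}, noise={noise_noisy:.2f}, t={t_noisy:.2f}")
```

The signal is identical in both cases, but the tight data yields a far larger t because the noise term in the denominator is so much smaller.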
The Three Types of T-Test (and When to Use Each)
Choosing the right t-test is the step people get wrong most often. Here's how to tell them apart:
- One-sample t-test — compares your group's mean against a known or hypothesised value. Example: Is the average IQ in your sample different from the population norm of 100?
- Independent-samples t-test — compares the means of two different, unrelated groups. Example: Do students taught with Method A score differently from students taught with Method B?
- Paired-samples t-test — compares two measurements from the same people (or matched pairs). Example: Do patients' anxiety scores change from before treatment to after?
The key distinction between the last two: independent = two separate sets of people; paired = the same people measured twice.
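One way to see how the one-sample and paired tests relate: a paired test is just a one-sample test run on each person's difference score, compared against zero. The sketch below assumes that framing; the function names and every number in it are hypothetical, chosen only to mirror the IQ and anxiety examples.

```python
import math
import statistics

def one_sample_t(values, reference):
    """t statistic and degrees of freedom for one group against a fixed value."""
    n = len(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    return (statistics.mean(values) - reference) / standard_error, n - 1

def paired_t(before, after):
    """Paired t-test: a one-sample test of the per-person differences against 0."""
    differences = [a - b for a, b in zip(after, before)]
    return one_sample_t(differences, 0)

# Hypothetical data, invented for illustration only.
iq_scores = [104, 98, 110, 101, 95, 107, 103, 99]
anxiety_before = [30, 28, 35, 40, 25, 33]
anxiety_after = [26, 27, 30, 34, 24, 29]

t_iq, df_iq = one_sample_t(iq_scores, 100)
t_anx, df_anx = paired_t(anxiety_before, anxiety_after)

print(f"one-sample: t({df_iq}) = {t_iq:.2f}")
print(f"paired:     t({df_anx}) = {t_anx:.2f}")
```

A negative paired t here simply means scores went down from before to after. The independent-samples test is different in kind because there is no per-person difference to compute.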
When Should You Use a T-Test?
Use a t-test when all of the following are true:
- Your outcome variable is continuous (scores, measurements, time — not categories).
- You are comparing two groups or fewer (one group vs. a value, or two groups against each other).
- Your data is roughly normally distributed within each group.
- For independent samples, the two groups have roughly equal variances (or you use Welch's correction, which handles unequal spread automatically).
If you have three or more groups, don't run multiple t-tests — that inflates your false-positive rate. Use a one-way ANOVA instead. If your data is heavily skewed or ordinal (like Likert ratings), a Mann-Whitney U test is the safer non-parametric alternative — see our guide on Mann-Whitney vs the t-test if you're unsure which fits your data.
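For the unequal-variance case, here is a rough sketch of what Welch's correction does: it keeps each group's variance separate instead of pooling them, then lowers the degrees of freedom using the Welch-Satterthwaite approximation. The function name `welch_t` and the data are assumptions made for illustration.

```python
import math
import statistics

def welch_t(group_a, group_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    # Squared standard error of each mean; no pooling, so unequal spread is allowed.
    se2_a = statistics.variance(group_a) / n_a
    se2_b = statistics.variance(group_b) / n_b
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / math.sqrt(se2_a + se2_b)
    # Approximate degrees of freedom; usually not a whole number.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df

# Made-up groups with similar means but very different spread.
steady = [70, 71, 69, 70, 72, 68, 70, 71]
erratic = [50, 90, 65, 80, 55, 85]

t, df = welch_t(steady, erratic)
print(f"Welch t = {t:.2f}, df = {df:.2f}")
```

With spread this lopsided, the degrees of freedom fall well below the pooled value of 12 and sit close to the noisier group's n − 1, which makes the test more conservative.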
A Worked Example With Real Numbers
Let's say you're testing whether a new study-skills workshop improves exam performance. You randomly assign 60 students to two groups: 30 attend the workshop, 30 don't. Afterward, everyone sits the same exam (scored out of 100).
- Workshop group: M = 74.2, SD = 8.1
- Control group: M = 68.5, SD = 9.3
Because these are two separate groups of people, you run an independent-samples t-test. The result comes out as:
t(58) = 2.53, p = .014, d = 0.65
Here's what each piece means:
- t(58) = 2.53: the t statistic is 2.53, with 58 degrees of freedom (total sample size minus 2, here 30 + 30 − 2). Bigger t = a stronger signal relative to noise.
- p = .014: there's about a 1.4% probability of seeing a difference at least this large if the workshop had no real effect. Since .014 is below the conventional .05 threshold, the result is statistically significant.
- d = 0.65: Cohen's d of 0.65 is a medium-to-large effect. The workshop didn't just produce a "detectable" difference; it produced a practically meaningful one (roughly two-thirds of a standard deviation).
Interpretation: Students who attended the workshop scored significantly higher on the exam than those who didn't, and the effect was substantial. You'd only reach this conclusion because both the p-value (is it real?) and the effect size (is it big?) pointed the same way.
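If you want to check these figures yourself, the sketch below recomputes t, the degrees of freedom, Cohen's d, and the p-value from the summary statistics alone. It assumes the pooled-variance test and a two-sided p-value. The helper names are invented, and the p-value comes from a simple numerical integration of the t density, used here only to keep the example dependency-free; in practice you would normally let a statistics package do that step.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule on the density from 0 to |t|."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3  # P(0 < T < |t|)
    return 1 - 2 * central_area

# Summary statistics from the workshop example.
n1, m1, sd1 = 30, 74.2, 8.1   # workshop group
n2, m2, sd2 = 30, 68.5, 9.3   # control group

df = n1 + n2 - 2
pooled_sd = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df)
standard_error = pooled_sd * math.sqrt(1 / n1 + 1 / n2)
t = (m1 - m2) / standard_error
d = (m1 - m2) / pooled_sd      # Cohen's d using the pooled SD
p = two_sided_p(t, df)

print(f"t({df}) = {t:.2f}, p = {p:.3f}, d = {d:.2f}")
```

The pooled standard deviation comes out at about 8.72, so the 5.7-point gap is roughly two-thirds of a standard deviation, which is where d = 0.65 comes from.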
Independent vs. Paired T-Test: Key Differences
| Feature | Independent-samples t-test | Paired-samples t-test |
|---|---|---|
| Who's compared | Two different groups | Same people, two time points |
| Typical use | Treatment vs. control | Before vs. after |
| Example | Method A vs. Method B students | Anxiety pre- vs. post-therapy |
| Data structure | Two separate columns, unrelated | Two columns, one row per person |
| Degrees of freedom | n₁ + n₂ − 2 | n pairs − 1 |
| Handles individual differences | No | Yes (compares each person to themselves) |
The paired test is often more powerful because it controls for individual differences — each person acts as their own baseline.
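Here is a small illustration of that power difference, using invented scores for five people who start at very different levels but each improve by a few points. Treating the two columns as unrelated groups buries the improvement in person-to-person variation; pairing removes it. Function names and data are hypothetical.

```python
import math
import statistics

def independent_t(a, b):
    """Pooled-variance t statistic, treating a and b as unrelated groups."""
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * statistics.variance(a)
                  + (n_b - 1) * statistics.variance(b)) / (n_a + n_b - 2)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(
        pooled_var * (1 / n_a + 1 / n_b))

def paired_t(a, b):
    """Paired t statistic, computed on each person's own difference."""
    diffs = [x - y for x, y in zip(a, b)]
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(len(diffs)))

# Made-up scores: people differ a lot from each other, but everyone gains 2 to 4 points.
before = [20, 35, 50, 65, 80]
after = [23, 37, 54, 67, 84]

t_independent = independent_t(after, before)
t_paired = paired_t(after, before)

print(f"treated as independent groups: t = {t_independent:.2f}")
print(f"treated as paired:             t = {t_paired:.2f}")
```

The mean improvement is 3 points either way. What changes is the noise term: the spread between people in one case, the spread of the improvements in the other.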
What Counts as a Significant Result?
A result is conventionally considered statistically significant when p < .05, meaning that if there were no true difference, a gap at least this large would turn up by chance less than 5% of the time. But significance is not the whole story. With a very large sample, even a trivially small difference can hit p < .05, which is why APA 7 requires you to report an effect size (Cohen's d for t-tests). As a rough guide: d = 0.2 is small, 0.5 is medium, and 0.8 is large. Always report the confidence interval too, since it shows the plausible range for the true difference.
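The sample-size point follows from simple algebra: for two independent groups of equal size n, the pooled t statistic equals d × √(n / 2). The sketch below applies that identity to a deliberately trivial effect of d = 0.05. The function name and the chosen sample sizes are illustrative assumptions.

```python
import math

def t_from_d(d, n_per_group):
    """t statistic implied by Cohen's d for two equal-sized independent groups."""
    return d * math.sqrt(n_per_group / 2)

tiny_effect = 0.05  # far below the "small" benchmark of 0.2

for n in (50, 500, 5000, 20000):
    print(f"n per group = {n:>5}: t = {t_from_d(tiny_effect, n):.2f}")
```

With very large samples the two-sided .05 cutoff for t is about 1.96, so the same negligible effect goes from nowhere near significant at n = 50 to clearly significant at n = 5,000, even though nothing about its practical size has changed.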
How to Report a T-Test in APA 7 Format
APA 7 wants the test statistic, degrees of freedom, exact p-value, and an effect size. A complete write-up looks like this:
An independent-samples t-test showed that students who attended the workshop (M = 74.2, SD = 8.1) scored significantly higher than the control group (M = 68.5, SD = 9.3), t(58) = 2.53, p = .014, d = 0.65; mean difference = 5.70, 95% CI [1.19, 10.21].
Note the formatting rules: t, p, M, SD, and d are all italicised, and p-values drop the leading zero (.014, not 0.014).
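The interval [1.19, 10.21] in that write-up is the 95% confidence interval for the difference between the two means (not for d). As a rough check, the sketch below rebuilds it from the summary statistics: mean difference ± critical t × standard error. The helper names are invented, and the critical value is found by bisection on a numerically integrated t density purely to avoid external dependencies; a statistics package would normally supply it directly.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def central_area(x, df, steps=2000):
    """P(-x < T < x) via Simpson's rule (steps must be even)."""
    h = x / steps
    total = t_pdf(0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 2 * total * h / 3

def t_critical(df, confidence=0.95):
    """Find the value t* with P(-t* < T < t*) = confidence, by bisection."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if central_area(mid, df) < confidence:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Summary statistics from the workshop example.
n1, m1, sd1 = 30, 74.2, 8.1
n2, m2, sd2 = 30, 68.5, 9.3

df = n1 + n2 - 2
pooled_sd = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df)
standard_error = pooled_sd * math.sqrt(1 / n1 + 1 / n2)
margin = t_critical(df) * standard_error
low, high = (m1 - m2) - margin, (m1 - m2) + margin

print(f"mean difference = {m1 - m2:.2f}, 95% CI [{low:.2f}, {high:.2f}]")
```

Because the interval excludes zero, it agrees with the significant p-value, and it also shows how uncertain the size of the benefit still is: anywhere from about 1 to about 10 exam points.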
Getting this notation exactly right by hand is fiddly, which is where a tool helps. StatRyx picks the correct t-test for your data, runs it, and generates the APA 7 sentence — italics, confidence interval, and effect size included — so you can paste it straight into your thesis.
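If you would rather script the notation yourself, the number formatting is straightforward to automate. This is a minimal sketch with hypothetical helper names (`format_p`, `apa_t_test`); it handles decimals and the leading zero only, since a plain string cannot carry italics.

```python
def format_p(p):
    """APA-style p-value: three decimals, no leading zero, floor at '< .001'."""
    if p < 0.001:
        return "p < .001"
    return "p = " + f"{p:.3f}".lstrip("0")

def apa_t_test(t, df, p, d):
    """Assemble the statistics portion of an APA-style t-test report."""
    return f"t({df}) = {t:.2f}, {format_p(p)}, d = {d:.2f}"

report = apa_t_test(2.5314, 58, 0.0141, 0.6536)
print(report)
```

Note that d keeps its leading zero because it can exceed 1, while p cannot. You would still need to italicise t, p, and d by hand in your word processor.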
Frequently Asked Questions
What is the difference between a t-test and an ANOVA?
A t-test compares the means of two groups (or one group against a value), while an ANOVA compares the means of three or more groups at once. Running