Step-by-Step Instructions
Gather and Combine Data for Ranking
First, compile all observations from all groups into a single list so they can be ranked together. Note which group each observation belongs to. For our example:

| Score | Group |
|-------|-------|
| 20 | A |
| 25 | A |
| 22 | A |
| 30 | B |
| 35 | B |
| 32 | B |
| 15 | C |
| 18 | C |
| 17 | C |

Determine the total number of observations (N). In this case, N = 3 (Group A) + 3 (Group B) + 3 (Group C) = 9.
Assign Ranks to All Observations
Sort all observations from smallest to largest, regardless of group. Then assign ranks starting from 1 for the smallest value. If there are tied values, assign each tied value the average of the ranks they would have received.

| Sorted Score | Original Group | Rank |
|--------------|----------------|------|
| 15 | C | 1 |
| 17 | C | 2 |
| 18 | C | 3 |
| 20 | A | 4 |
| 22 | A | 5 |
| 25 | A | 6 |
| 30 | B | 7 |
| 32 | B | 8 |
| 35 | B | 9 |
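The ranking rule above, including the averaging of tied ranks, can be sketched in a few lines of Python. The function name `average_ranks` is a hypothetical helper introduced for illustration; it is not taken from any library.

```python
def average_ranks(values):
    """Return 1-based ranks for `values`; tied values share their average rank."""
    # Indices of the observations, ordered from smallest to largest value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Sorted positions start..end correspond to ranks start+1..end+1.
        mean_rank = (start + 1 + end + 1) / 2
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


# Scores in the order A, A, A, B, B, B, C, C, C from the example.
scores = [20, 25, 22, 30, 35, 32, 15, 18, 17]
ranks = average_ranks(scores)
print(ranks)  # [4.0, 6.0, 5.0, 7.0, 9.0, 8.0, 1.0, 3.0, 2.0]
```

The example data has no ties, so every rank is a whole number. With ties, `average_ranks([10, 20, 20, 30])` gives the two middle values a rank of 2.5 each.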
Calculate the Sum of Ranks for Each Group
Now, separate the ranks back into their original groups and sum the ranks for each group. This gives you R_i for each group.

* **Group A Ranks:** 4, 5, 6
  * **R_A** = 4 + 5 + 6 = 15
  * **n_A** = 3
* **Group B Ranks:** 7, 8, 9
  * **R_B** = 7 + 8 + 9 = 24
  * **n_B** = 3
* **Group C Ranks:** 1, 2, 3
  * **R_C** = 1 + 2 + 3 = 6
  * **n_C** = 3
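As a rough sketch, the grouping and summing can be done with two dictionaries. The variable names are illustrative. The final check relies on the fact that the ranks 1 through N always sum to N(N + 1)/2, which is a quick way to catch ranking mistakes.

```python
from collections import defaultdict

# (rank, group) pairs copied from the ranking table in the example.
ranked = [
    (1, "C"), (2, "C"), (3, "C"),
    (4, "A"), (5, "A"), (6, "A"),
    (7, "B"), (8, "B"), (9, "B"),
]

rank_sums = defaultdict(float)   # R_i for each group
group_sizes = defaultdict(int)   # n_i for each group
for rank, group in ranked:
    rank_sums[group] += rank
    group_sizes[group] += 1

# Sanity check: all ranks together must sum to N(N + 1) / 2.
n_total = len(ranked)
assert sum(rank_sums.values()) == n_total * (n_total + 1) / 2

for group in sorted(rank_sums):
    print(group, rank_sums[group], group_sizes[group])
```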
Apply the Kruskal-Wallis H Formula
Plug the values (N, n_i, R_i) into the Kruskal-Wallis H formula:

H = [12 / (N * (N + 1))] * Σ [R_i^2 / n_i] - 3 * (N + 1)

Substitute the values from our example:

* N = 9
* n_A = 3, R_A = 15
* n_B = 3, R_B = 24
* n_C = 3, R_C = 6

H = [12 / (9 * (9 + 1))] * [(15^2 / 3) + (24^2 / 3) + (6^2 / 3)] - 3 * (9 + 1)

H = [12 / (9 * 10)] * [(225 / 3) + (576 / 3) + (36 / 3)] - 3 * 10

H = [12 / 90] * [75 + 192 + 12] - 30

H = (12 / 90) * 279 - 30

H = 37.2 - 30

H = 7.2

So, the calculated Kruskal-Wallis H statistic is **7.20**.
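The substitution above can be checked with a short Python sketch. It uses exact fractions so that rounding 12/90 to a few decimals cannot distort the result. The function name `kruskal_wallis_h` is a hypothetical helper, and this version applies no tie correction.

```python
from fractions import Fraction


def kruskal_wallis_h(rank_sums, group_sizes):
    """H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1), without tie correction."""
    n_total = sum(group_sizes.values())
    weighted = sum(
        Fraction(rank_sums[g]) ** 2 / group_sizes[g] for g in rank_sums
    )
    return Fraction(12, n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)


# Rank sums and group sizes from the worked example.
rank_sums = {"A": 15, "B": 24, "C": 6}
group_sizes = {"A": 3, "B": 3, "C": 3}

h = kruskal_wallis_h(rank_sums, group_sizes)
print(h, float(h))  # 36/5 7.2
```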
Determine Statistical Significance
To determine if H is statistically significant, compare it to a critical value from a chi-squared (χ²) distribution table. The degrees of freedom (df) for the Kruskal-Wallis test are k - 1, where k is the number of groups. In our example, df = 3 - 1 = 2. For an alpha level (α) of 0.05 and df = 2, the critical chi-squared value is approximately **5.991**.

* **Decision Rule:** If H > critical χ² value, reject the null hypothesis.
* **Our Result:** Since H (7.20) > 5.991, we reject the null hypothesis.

This means there is a statistically significant difference in exam scores among the three teaching methods. Note that with only three observations per group the chi-squared approximation is rough, so for samples this small an exact table or statistical software is preferable.
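For the special case df = 2, the chi-squared upper-tail probability has the closed form exp(-x/2), so the table lookup can be reproduced with the standard library alone. The sketch below assumes df = 2 and does not generalize to other degrees of freedom, where a statistics library would be needed. Both function names are hypothetical.

```python
import math


def chi2_sf_df2(x):
    """Upper-tail probability P(X > x) for a chi-squared variable with exactly 2 df."""
    return math.exp(-x / 2)


def chi2_critical_df2(alpha):
    """Critical value c such that P(X > c) = alpha, valid for 2 df only."""
    return -2 * math.log(alpha)


h = 7.2        # H statistic from the worked example
alpha = 0.05   # significance level

critical = chi2_critical_df2(alpha)
p_value = chi2_sf_df2(h)
reject_null = h > critical

print(f"critical = {critical:.3f}, p = {p_value:.4f}, reject H0: {reject_null}")
# critical = 5.991, p = 0.0273, reject H0: True
```

The p-value here comes from the same chi-squared approximation as the table, so it shares the small-sample limitations of the table-based decision.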
Perform Post-Hoc Analysis (If H is Significant)
Because the Kruskal-Wallis test is significant, at least one group's median differs from the others. To identify *which* specific groups differ, you need to conduct post-hoc tests. Common choices include Dunn's test or Conover's test, usually with an adjustment for multiple comparisons (e.g., Bonferroni correction) to control the family-wise error rate. Performing post-hoc tests by hand is tedious, since every pair of groups needs its own rank-based comparison. In practice, this step is almost always done with statistical software or a dedicated calculator to ensure accuracy and appropriate p-value adjustments.
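To give a feel for what such software does, here is a minimal sketch of Dunn's test with a Bonferroni adjustment. It assumes there are no tied ranks (so no tie correction is applied to the standard error) and uses a two-sided normal approximation. The function name `dunn_bonferroni` is hypothetical, and dedicated statistical packages should be preferred for real analyses.

```python
from itertools import combinations
from math import sqrt
from statistics import NormalDist


def dunn_bonferroni(rank_sums, group_sizes):
    """Return {(a, b): (z, adjusted_p)} for every pair of groups (no tie correction)."""
    n_total = sum(group_sizes.values())
    pairs = list(combinations(sorted(rank_sums), 2))
    results = {}
    for a, b in pairs:
        # Difference in mean ranks between the two groups.
        diff = rank_sums[a] / group_sizes[a] - rank_sums[b] / group_sizes[b]
        # Standard error of that difference under the null hypothesis.
        se = sqrt(n_total * (n_total + 1) / 12 * (1 / group_sizes[a] + 1 / group_sizes[b]))
        z = diff / se
        p_raw = 2 * (1 - NormalDist().cdf(abs(z)))
        # Bonferroni: multiply by the number of comparisons, capped at 1.
        results[(a, b)] = (z, min(1.0, p_raw * len(pairs)))
    return results


rank_sums = {"A": 15, "B": 24, "C": 6}
group_sizes = {"A": 3, "B": 3, "C": 3}

for pair, (z, p_adj) in dunn_bonferroni(rank_sums, group_sizes).items():
    print(pair, round(z, 3), round(p_adj, 4))
```

For the example data, only the B versus C comparison stays below 0.05 after adjustment. With three observations per group, the normal approximation is coarse, so treat these values as illustrative.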
The Kruskal-Wallis H test is a non-parametric statistical method used to determine if there are statistically significant differences between the medians of three or more independent groups on a continuous or ordinal dependent variable. It serves as a non-parametric alternative to the one-way analysis of variance (ANOVA) when the assumptions for ANOVA (e.g., normality of data) are not met.
Prerequisites for the Kruskal-Wallis H Test
Before proceeding with the calculation, ensure your data meets the following criteria:
- Independent Groups: You have three or more independent groups.
- Ordinal or Continuous Data: The dependent variable is measured on an ordinal scale or a continuous scale.
- Random Sampling: Observations within each group are a random sample from the population.
- Homogeneity of Variance (Shape): The test does not assume normality, but interpreting it as a comparison of medians assumes that the group distributions have similar shapes. If the shapes are very different, a significant result might indicate differences in shape rather than median.
The Kruskal-Wallis H Formula
The formula for the Kruskal-Wallis H statistic is:
H = [12 / (N * (N + 1))] * Σ [R_i^2 / n_i] - 3 * (N + 1)
Where:
- N = Total number of observations across all groups.
- k = Number of groups.
- n_i = Number of observations in group i.
- R_i = Sum of ranks for group i.
- Σ = Summation across all k groups.
Worked Example: Comparing Exam Scores Across Three Teaching Methods
Imagine a researcher wants to compare the effectiveness of three different teaching methods (Method A, Method B, Method C) on student exam scores. They collect the following scores from three independent groups of students:
- Method A: 20, 25, 22
- Method B: 30, 35, 32
- Method C: 15, 18, 17
Goal: Determine if there's a significant difference in exam scores among the three teaching methods using a significance level (alpha) of 0.05.
Common Pitfalls to Avoid
- Incorrect Ranking of Ties: If two or more observations have the same value, they are assigned the average of the ranks they would have received. For example, if two values tie for the 4th and 5th ranks, both receive a rank of (4+5)/2 = 4.5. Failing to do this correctly will skew your results.
- Misinterpreting H: A significant H value indicates that at least one group median is different from the others, but it does not tell you which specific groups differ. You need post-hoc tests for that.
- Forgetting Post-Hoc Tests: If your Kruskal-Wallis test is significant, you must perform post-hoc tests (e.g., Dunn's test with Bonferroni correction) to identify the specific group differences. Without them, your conclusion is incomplete.
- Applying to Dependent Samples: The Kruskal-Wallis test is for independent samples. For dependent samples (e.g., repeated measures), Friedman's ANOVA is the appropriate non-parametric test.
When to Use a Calculator for Convenience
While understanding the manual calculation is crucial for conceptual grasp, using a calculator or statistical software becomes highly advantageous in several scenarios:
- Large Datasets: Manually ranking and summing ranks for hundreds or thousands of observations is time-consuming and prone to error.
- Complex Tie Handling: If your data contains numerous tied values, the formula for H needs a correction factor, which is cumbersome to calculate by hand. Calculators handle this automatically.
- Precise P-Values: Statistical software reports the p-value itself, which is more informative than only checking whether your H statistic exceeds a critical value from a chi-squared table.
- Automated Post-Hoc Analysis: Many statistical tools automatically perform and correct for multiple comparisons in post-hoc tests, saving significant effort and reducing the chance of error.
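The tie correction mentioned in the list above is commonly written as C = 1 - Σ(t^3 - t) / (N^3 - N), where t is the size of each group of tied values, and the corrected statistic is H / C. The sketch below illustrates that standard formula; the function name `tie_correction_factor` is hypothetical.

```python
from collections import Counter


def tie_correction_factor(values):
    """Return C = 1 - sum(t^3 - t) / (N^3 - N) over all groups of tied values."""
    n_total = len(values)
    # Values that occur once contribute 1^3 - 1 = 0, so only real ties matter.
    tie_term = sum(t**3 - t for t in Counter(values).values())
    return 1 - tie_term / (n_total**3 - n_total)


# The worked example has no ties, so the factor is 1 and H is unchanged.
scores = [20, 25, 22, 30, 35, 32, 15, 18, 17]
h_uncorrected = 7.2
c = tie_correction_factor(scores)
h_corrected = h_uncorrected / c
print(c, h_corrected)  # 1.0 7.2
```

Because C is at most 1, dividing by it can only increase H, so ignoring ties makes the test slightly conservative.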