Step-by-Step Instructions
Arrange Your Data in Ascending Order
First, arrange your dataset in ascending order. Sorting is required because each observation is paired with the expected value for its rank under normality. For example, the dataset {12, 15, 18, 20, 22} is already in ascending order, so it needs no rearrangement.
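As a minimal sketch of this step, the snippet below sorts the five example values. The unsorted input order is a hypothetical one chosen for illustration; only the values come from the example above.

```python
# Hypothetical unsorted input; the values match the example dataset.
data = [20, 12, 22, 15, 18]

# sorted() returns a new list in ascending order and leaves `data` unchanged.
ordered = sorted(data)

# Rank i (starting at 1) is the position of a value in the sorted list.
for rank, value in enumerate(ordered, start=1):
    print(rank, value)
```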
Calculate the Mean and Standard Deviation
Next, calculate the mean (μ) and standard deviation (σ) of your dataset. The mean is calculated as the sum of all values divided by the number of values, while the standard deviation is calculated as the square root of the sum of squared differences from the mean divided by the number of values minus one.
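The arithmetic for the example dataset {12, 15, 18, 20, 22} can be sketched as follows. The variable names are illustrative assumptions; the calculation follows the definitions above, with n - 1 in the denominator of the standard deviation.

```python
import math

data = [12, 15, 18, 20, 22]
n = len(data)

# Mean: sum of all values divided by the number of values.
mean = sum(data) / n

# Sum of squared differences from the mean.
ss = sum((x - mean) ** 2 for x in data)

# Sample standard deviation: divide by n - 1, then take the square root.
sd = math.sqrt(ss / (n - 1))

print(f"mean = {mean:.2f}, sum of squares = {ss:.2f}, sd = {sd:.4f}")
```

For this dataset the mean is 87 / 5 = 17.4 and the sum of squared differences is 63.2, so the standard deviation is the square root of 63.2 / 4 = 15.8.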
Calculate the Expected Values Under Normality
Approximate the expected values under normality using the inverse cumulative distribution function (CDF) of the standard normal distribution. The expected value for the observation of rank i is approximately μ + σ × Φ^(-1)((i - 3/8) / (n + 1/4)), where i is the rank of the observation (1 for the smallest), n is the number of observations, and Φ^(-1) is the inverse CDF of the standard normal distribution.
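The formula above can be sketched with Python's standard library, where NormalDist().inv_cdf plays the role of Φ^(-1). The variable names (m, expected) are assumptions made for this illustration.

```python
from statistics import NormalDist, mean, stdev

data = sorted([12, 15, 18, 20, 22])
n = len(data)
mu = mean(data)
sigma = stdev(data)  # sample standard deviation (n - 1 denominator)

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# Standard normal scores for ranks i = 1..n, using (i - 3/8) / (n + 1/4).
m = [std_normal.inv_cdf((i - 3 / 8) / (n + 1 / 4)) for i in range(1, n + 1)]

# Rescale the scores to the location and spread of the data.
expected = [mu + sigma * z for z in m]

for x, e in zip(data, expected):
    print(f"observed {x:>3}  expected {e:7.3f}")
```

The scores are symmetric about zero, so with an odd number of observations the middle expected value equals the mean.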
Calculate the W Statistic
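The W statistic is the square of a weighted sum of the ordered observations, divided by the sum of squared differences between the observed values and the mean. The formula for the W statistic is W = (∑ a_i x_(i))^2 / (∑(x_i - μ)^2), where x_(i) is the i-th smallest observed value, μ is the mean, and the weights a_i are derived from the expected values of standard normal order statistics and their covariances, scaled so that ∑ a_i^2 = 1. A simpler Shapiro-Francia-style approximation sets a_i = m_i / √(∑ m_j^2), with m_i = Φ^(-1)((i - 3/8) / (n + 1/4)). W lies between 0 and 1, and values close to 1 indicate that the data agree closely with a normal distribution.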
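A minimal sketch of this calculation follows. It does not use the exact Shapiro-Wilk weights, which require the covariances of normal order statistics. Instead it assumes the simpler Shapiro-Francia-style weights a_i = m_i / √(∑ m_j^2), so the result only approximates W. The function name approx_w is hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean


def approx_w(values):
    """Approximate W with weights a_i = m_i / sqrt(sum of m_j^2).

    This is a Shapiro-Francia-style simplification, not the exact
    Shapiro-Wilk weights.
    """
    x = sorted(values)
    n = len(x)
    # Standard normal scores for ranks 1..n.
    m = [NormalDist().inv_cdf((i - 3 / 8) / (n + 1 / 4)) for i in range(1, n + 1)]
    norm = sqrt(sum(z * z for z in m))
    a = [z / norm for z in m]  # weights scaled so their squares sum to 1
    x_bar = mean(x)
    numerator = sum(a_i * x_i for a_i, x_i in zip(a, x)) ** 2
    denominator = sum((x_i - x_bar) ** 2 for x_i in x)
    return numerator / denominator


w = approx_w([12, 15, 18, 20, 22])
print(f"approximate W = {w:.3f}")
```

For the example dataset the approximate W comes out close to 1, which is consistent with data that are roughly evenly spread around their mean.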
Determine the p-Value
The p-value is obtained from statistical software, or bounded using a table of critical values for the Shapiro-Wilk test. It is the probability of observing a W statistic as small as, or smaller than, the one calculated, assuming that the data is normally distributed. Small values of W are evidence against normality. If the p-value is less than your chosen significance level (usually 0.05), you reject the null hypothesis that the data is normally distributed.
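When neither a table nor statistical software is at hand, one way to approximate a p-value is by simulation. The sketch below is a Monte Carlo estimate, not the method used by standard software. It assumes the Shapiro-Francia-style approximation to W, and the function names, the number of simulations, and the seed are all hypothetical choices.

```python
import random
from math import sqrt
from statistics import NormalDist


def approx_w(values):
    """Shapiro-Francia-style approximation to W (not the exact statistic)."""
    x = sorted(values)
    n = len(x)
    m = [NormalDist().inv_cdf((i - 3 / 8) / (n + 1 / 4)) for i in range(1, n + 1)]
    x_bar = sum(x) / n
    num = sum(m_i * x_i for m_i, x_i in zip(m, x)) ** 2
    den = sum(m_i ** 2 for m_i in m) * sum((x_i - x_bar) ** 2 for x_i in x)
    return num / den


def monte_carlo_p_value(values, n_sim=2000, seed=0):
    """Estimate P(W <= observed W) by simulating normal samples of the same size."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    w_obs = approx_w(values)
    n = len(values)
    hits = 0
    for _ in range(n_sim):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if approx_w(sample) <= w_obs:
            hits += 1
    # Add 1 to both counts so the estimate is never exactly zero.
    return (hits + 1) / (n_sim + 1)


p = monte_carlo_p_value([12, 15, 18, 20, 22])
print(f"simulated p-value: {p:.3f}")
```

Because W does not change when the data are shifted or rescaled, simulating from the standard normal distribution is sufficient regardless of the mean and standard deviation of the data.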
Interpret the Results
Finally, interpret the results of the Shapiro-Wilk test. If the p-value is less than your chosen significance level, you conclude that the data is not normally distributed. Otherwise, you fail to reject the null hypothesis that the data is normally distributed, which is not the same as proving normality. Note that the Shapiro-Wilk test is sensitive to sample size: with large samples, even small departures from normality can lead to rejection of the null hypothesis, while with small samples the test has little power to detect departures.
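The decision rule can be written as a small helper. The function name interpret and the wording of the returned messages are illustrative assumptions.

```python
def interpret(p_value, alpha=0.05):
    """Apply the decision rule for a normality test at significance level alpha."""
    if p_value < alpha:
        return "reject the null hypothesis: the data do not look normally distributed"
    return "fail to reject the null hypothesis: no evidence against normality"


print(interpret(0.03))
print(interpret(0.40))
```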
The Shapiro-Wilk test is a statistical method used to determine if a dataset is normally distributed. It is commonly used in hypothesis testing to ensure that the data meets the assumptions of parametric tests. In this guide, we will walk you through the steps to perform a Shapiro-Wilk test manually and provide a worked example with real numbers.
Introduction to the Shapiro-Wilk Test
The Shapiro-Wilk test is based on the W statistic, which measures how closely the ordered observed values agree with the values expected under normality. W lies between 0 and 1, and values near 1 indicate close agreement. The test also provides a p-value, which indicates the probability of observing a W statistic as small as, or smaller than, the one calculated, assuming that the data is normally distributed.
Step-by-Step Calculation
To perform a Shapiro-Wilk test manually, follow these steps: